ABSTRACT


It occasionally happens in economic analyses that the correctly 
specified model contains variables for which no observed data has been 
collected. When the data in a linear regression model are cross- 
sectional it is possible, under certain conditions on the nature of the 
variables, to estimate the independent effects of a specific set of 
explanatory variables on the dependent variable. A procedure for doing 
this is presented. 

A commonly used model of reenlistment behavior, for which the data
base is cross-sectional, satisfies the requisite conditions. This
permits the estimation of the independent effect of the military wage
on reenlistment rate, as an illustration of the proposed procedure.

I. INTRODUCTION


A. PRELIMINARY 

There is currently some concern about the enlistment and retention of 
men to serve in the armed forces in a draft-free environment. In defining 
the problem to be resolved, a number of studies (notably [1]) have attempt- 
ed to describe the factors which affect enlistment and reenlistment 
behavior. A large part of this interest is directed toward the
determination of a military wage structure which will ensure that
civilians will enlist, and that servicemen will reenlist, in sufficient
numbers to meet service manpower requirements. This paper will
concentrate on a part of this latter problem. Specifically, the purpose
here is to estimate the elasticity of reenlistment rate with respect to
military wage for first-term reenlistees in the Navy. Though studies of
this kind have already
been conducted, there are a number of reasons for additional study. Among 
them is that a new source of data (previously unused data in the form of 
BuPers Report ED198A for fiscal years 1964 through 1970) is used here, 
which is more complete than that used in prior studies. As a consequence 
of the availability of the new data, some omissions of previous studies 
may be corrected. But, most importantly, a somewhat novel procedure is
used to estimate the parameter of interest in what will later be introduced
as the reenlistment model.


B. BACKGROUND; DESCRIPTION OF THE DATA 

In the past, extensive reliance has been placed on the technique of
gathering information about reenlistment behavior by the use of surveys
of potential reenlistees. This technique depends on before-the-fact
information, which is in the form of the stated intentions of men facing
the decision to reenlist. Typically these surveys seek to determine, by
means of a question and response approach to the subjects, the factors 
which affect the reenlistment decision, and thus have value in indicating 
the lines along which quantitative research should be performed. That is, 
they serve primarily to identify those factors which should enter into an 
analytic model of reenlistment behavior. But once such a model is 
constructed, reliable quantitative results can only be obtained by investi- 
gating the observed behavior of potential reenlistees. This after-the-fact 
information, the revealed reenlistment behavior, is provided by the newly 
available data used in this paper. 

Data extracted from BuPers Report ED198A for use here have the form of
pooled time series and cross-sectional information. In particular, the 
numbers of men eligible to reenlist and the numbers of these that do in 
fact reenlist are provided for each combination of 
(1) Pay grade: E-1 through E-9 
(2) Rate (a Navy skill or job specialty classification): BM, QM, ST, TM, 
Bei, Eis DS, AT, AX, AQ, TD, SM, RD, RMy Ci, AC, PT, HM, DT, DM, MU, 
mee ra, PH, YN, PN, DP, SK, DK, JO, PC, AK, AZ, GM, MN, IM, OM, EN, BT, 
muet. CM; AD, AO, AB, AE, AM, PR, LI, MR, SF, DC, PM, ML, CE, EO, Bs 
Samer, CS, SH, SD, MM, AV, SP, BR, EQ, CU, SO, AW, AS. 

(3) Mental Group: I, II, upper III, lower III, IV. 

(4) Fiscal year of reenlistment: 1964 through 1970.

First-term reenlistments only are considered. (First-term reenlistments
are those of servicemen completing their initial term of active
obligated service.) Reenlistments beyond the first term are considerably
less interesting, since these advanced-term reenlistments typically
involve personnel already committed (psychologically) to a Navy career.








"Mental Group," a designation akin to IQ that is applied to enlisted 
personnel, is determined by testing as is intelligence quotient. As such 
it is not likely to be highly reliable. Aside from the facility with 
which personnel in the higher mental groups may enter certain more tech- 
nical Rates, and the fact that it may be significant for an enlisted man 
who wishes to become an officer candidate, there is no special advantage 
or disadvantage accrued by designation as a member of any particular men- 
tal group. On the contrary, there is possibly even a tendency on the 
part of a certain group of men to score poorly, purposely, in the testing. 
This group would consist of some of the personnel of better than average 
education who have enlisted in the Navy, during the past few years of a 
high level of military activity in Vietnam, to fulfill military service 
obligation and to avoid more hazardous duties. It is likely that some 
part of this group, in merely wishing to serve their required time in the 
armed forces, would seek to escape prominence in their enlisted service. 
There is, as a consequence, seemingly little general incentive to score
well in Mental Group testing. In addition, testing for Mental Group
classification is subject to the same criticisms that have recently been
directed at IQ testing: some minority groups may be put at a
disadvantage by the biased (toward comprehensibility by white mid-Americans)
nature of the test. In any case, classification by Mental Group is
certainly less reliable than cross-sectional classification by pay grade
or Rate, or time series classification by fiscal year of reenlistment.
As a consequence, the Mental Group classification will not be of primary
interest here.

Certain of the Rates included in the above report are unsuitable for 
inclusion in the analysis. Those Rates that are discarded from the data
base are AV, SP, BR, EQ, CU, SO, AW, AS, MT, DS and SD. Any Rate not
included in the study was disallowed for one of the following reasons:

1. The Rate consisted of pay grades E-7 through E-9 only;

2. The Rate's membership consisted in large part of foreign nationals
who could be expected to reenlist with high probability;

3. Data for the Rate were not available for each of the fiscal years
1964 through 1970.

The fact that the data consists of a time series of cross-sections of 
revealed reenlistment behavior allows the correction of an omission of 
previous research. To date little effort has been made to establish a 
relationship between the variation over time of reenlistment behavior and 
the variation over time of pecuniary considerations facing the potential 
reenlistee. The time series of cross-sectional data provides a basis on 
which such a relationship can be constructed. The term "constructed" is 
used advisedly, since the pecuniary factors considered here are those 
imbedded in a particular model of reenlistment behavior. 

Another disadvantage of previous research has been that pecuniary 
factors for potential reenlistees have only been considered in coarse de- 
tail. The minuteness of the new cross-sectional data, on the other hand, 
permits a more precise formulation of the economic factors that face the 
individual potential reenlistee. These factors vary from man to man; they 
are dependent on the individual's level of proficiency (pay grade), job 
specialty (Rate), and fiscal year in which the reenlistment decision is
made.

II. THEORY UNDERLYING THE REENLISTMENT MODEL


A. FOUNDATION 

The aim in this paper is to determine the rate of change of first- 
term Navy reenlistments with respect to the rate of change in military 
compensation. Toward this end a model is presented to describe 
reenlistment behavior, quantitatively represented by reenlistment rate, 
in terms of those variables which affect the reenlistment decision. 
Then, using the model as a basis, the pure effect of the military wage
on reenlistment rate is determined. Necessarily, the influence of all
other variables must be removed in order to estimate the independent
effect of the military wage.


B. TASTE AND OPPORTUNITY FACTORS

Consider an individual who is eligible to reenlist. The variables
which affect his decision may be aggregated into three broad categories:
pecuniary, personal non-pecuniary and general non-pecuniary. The first 
two of these categories are of interest in this section (the final 
category is discussed later). Within the first category are all 
factors which reflect opportunity (monetary) considerations. It 
includes such variables as expected basic military wage, benefits to 
servicemen which may be expressed equivalently in monetary terms, and 
the alternative civilian wage. Elements in the personal non-pecuniary 
class include such factors as military job satisfaction, agreeability 
with the quality of home life offered by Navy service, adaptability 
to the military hierarchy, and attitude towards sea or shipboard
duty. Variables which are described as non-pecuniary are difficult to
quantify. However, by employing the concept of reservation wage (for
a more complete discussion, see, for example, Gray [2]), the effect of
these purely individual non-pecuniary factors on the reenlistment
decision can be incorporated in a variable with analytic expression. The
phrase "purely individual" is to be stressed. Just as
factors which affect the reenlistment decision and which are unique to 
each individual can be identified, so can be recognized non-pecuniary 
factors affecting the reenlistment decision which are unique to each 
Rate, or to each pay grade, or to each year. Variables of this sort 
are the general non-pecuniary factors and will be introduced and 
treated later. This is accomplished by considering the pecuniary 
compensation that will just induce an individual to reenlist. The 
variables in the class of personal non-pecuniary factors can be viewed 
as elements which contribute to the determination of the value of 
compensation required to induce reenlistment. Knowledge of this level 
of compensation for an individual makes knowledge of the personal non- 
pecuniary factors affecting his reenlistment behavior redundant (at 
least in a study where interest centers on macroscopic reenlistment 
behavior). As a consequence, the personal non-pecuniary variables
need not be explicitly considered, since they are imbedded in the
individual's reservation wage, which will now be defined. [This is an
advantage of the use of data describing revealed reenlistment behavior:
an individual's personal non-pecuniary attitudes are inconsequential;
the fact of his reenlistment displays that any personal dislikes of the
service were overcome by sufficient compensation.] Suppose that an
individual deliberating reenlistment is capable of estimating the
expected present value of his alternative courses of action: to
reenlist or not to reenlist. Let WM represent the present value of all
pecuniary returns if his choice is to reenlist, and let WC represent 
the present value of all pecuniary returns if he chooses not to reenlist. 
WM consists of two types of pecuniary returns. Most obviously there are 
those whose dollar value is fixed and is not subject to individual 
interpretation: basic pay, variable reenlistment bonus, basic allow- 
ance for subsistence, clothing allowance. There are also pecuniary 
returns whose dollar value is in large part subjectively determined by 
the individual: free medical services for the serviceman and his 
dependents, Navy exchange and commissary privileges and others. This 
distinction is not negligible, and will be treated explicitly later. 

For a serviceman on active duty, the determination of WC is not as 
straightforward as that of WM. Typically the serviceman may have little 
more than a rough estimate, in the year in which the reenlistment 
decision is made, of the mean wage received by civilians working in a 
job category similar to that of the serviceman and located in the
geographical area of interest to him. Now define the ratio WM/WC as the
relative wage.
Then the reservation relative wage is defined as the value of the above 
ratio which will just induce the serviceman to reenlist. The individual 
will reenlist if his actual relative wage is greater than or equal to 
his reservation relative wage. Similarly, among the entire cohort of 
eligible reenlistees, those that reenlist will be those whose actual 
relative wage is greater than or equal to their reservation relative 
wage. Now consider the domain of possible values of reservation
relative wage. For each number in this domain, some portion of the
eligible population will reenlist. As a consequence, the reenlistment
rate (over the eligible population) has some functional expression over
the domain of reservation relative wage. This introduces a variable of
fundamental importance in constructing an analytic expression for 
reenlistment rate. 
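
The decision rule just described can be illustrated with a minimal
sketch. It is not part of the original analysis: the function name
reenlistment_rate, the five reservation relative wages and the dollar
figures are all hypothetical, chosen only to show how a reenlistment
rate arises as a function of the relative wage WM/WC.

```python
def reenlistment_rate(reservation_ratios, wm, wc):
    """Fraction of an eligible cohort that reenlists.

    A man reenlists when the actual relative wage WM/WC is greater than
    or equal to his own reservation relative wage.
    """
    actual = wm / wc
    stayers = [r for r in reservation_ratios if actual >= r]
    return len(stayers) / len(reservation_ratios)


# Hypothetical reservation relative wages for five eligible men.
cohort = [0.8, 0.9, 1.0, 1.1, 1.3]

# Same civilian alternative WC, two hypothetical values of WM.
rate_low = reenlistment_rate(cohort, wm=18000.0, wc=20000.0)   # WM/WC = 0.9
rate_high = reenlistment_rate(cohort, wm=22000.0, wc=20000.0)  # WM/WC = 1.1
```

With WC held fixed, raising WM can only add men to the set whose
reservation relative wage is met, so the rate is non-decreasing in WM.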

The form of the functional dependence will be discussed later. It 
is worth noting here that an individual's reservation relative wage is
some fixed value of the ratio WM/WC. Presumably, an individual
considering reenlistment is able to estimate the expected present value of
pecuniary returns for not reenlisting, so his reservation relative wage 
can be equivalently expressed as the ratio of a sufficiently large value 
of expected present value of returns for reenlisting to his estimate of 
returns for not reenlisting. This says of course that for each 
individual the reservation wage uniquely determines a value of WM 
sufficiently large to induce reenlistment. As a consequence reenlist- 
ment rate, for fixed WC, has a functional representation over the 
domain of WM: for each value of WM a certain fraction of the eligible 
population with given WC will reenlist. The implications of these 
obvious comments are meant as a preliminary to later work. In order to 
assure proper statistical control of the variables in the model, it is 
necessary to be able to match observations of reenlistment rate with 
corresponding relative wage. That is, a particular set of men eligible 
to reenlist faces a given relative wage (the members of this set who 
reenlist in the face of this relative wage are those for whom this 
relative wage is the reservation relative wage). This set of men 
eligible to reenlist must be identifiable, for each observed relative 
wage, in order to be able to perform significant statistical analysis. 
By the preceding remarks, an equivalent necessary condition for proper
statistical control is that for any fixed value of WC it is possible to
identify the set of men eligible to reenlist which corresponds to any value
of WM. Or, for any value of WC and any value of WM, it is necessary to
be able to identify the appropriate corresponding eligible population. 
Now just as the purpose of this section was to eliminate the necessity 
of identifying, and including in the model, variables which are in the 
class of personal non-pecuniary factors, a purpose of a later section
will be to remove the requirement that the value of WC for a potential 
reenlistee be known. What will in effect be accomplished is that the 
variable WC will be removed from the model, so that a correspondence 
between reenlistment rate and WM only need be made in order to satisfy 
the functional requirement that reenlistment rate depends on relative 
wage and the statistical requirement that the appropriate eligible 
population be identifiable for given WM and WC. 

C. THE REENLISTMENT MODEL IN CROSS-SECTION AND TIME SERIES; OTHER
FACTORS AFFECTING REENLISTMENT RATE

In the preceding section, a model of the form R = f(WM/WC) was
postulated, where WM and WC are as previously defined and R represents
reenlistment rate. Fisher [3] and [4] first concluded that a model of
the form R = f(ln(WM/WC)) was indicated. Specifically, Fisher concluded
that the appropriate model was expressed by:

     $R = \alpha + \beta \ln(WM/WC) + \epsilon$,

a linear expression for R in ln(WM/WC), with disturbance term
$\epsilon$. Later work, for example Nelson [5], employed a relation of
the form:

(a)  $\ln R = \alpha + \beta \ln(WM/WC) + Z + \epsilon$,

where the term Z represents an additional set of variables which are
included in the model. The variables in Z depend, of course, on the
author of the study employing the model. A similar model in logit form,

(b)  $\ln\left(\frac{R}{1-R}\right) = \alpha + \beta \ln(WM/WC) + Z + \epsilon$,

has also been considered by, for example, Gray [2] and Wilburn [6].
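In this paper models of both forms (a) and (b) will be considered
for comparative purposes. Note that equations (a) and (b) may be
rewritten as:

(a')  $\ln R = \alpha + \beta \ln WM - \beta \ln WC + Z + \epsilon$,

(b')  $\ln\left(\frac{R}{1-R}\right) = \alpha + \beta \ln WM - \beta \ln WC + Z + \epsilon$.

Or:

(a")  $R = \alpha' \left(\frac{WM}{WC}\right)^{\beta} Z' \epsilon'$,

where:

     $\alpha' = \exp(\alpha)$, $Z' = \exp(Z)$, and $\epsilon' = \exp(\epsilon)$.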


These equations imply that, depending on which of the models (a) or
(b) is used, either $\ln R$ or $\ln(R/(1-R))$ is linear in the natural
log of the ratio WM/WC (neglecting for the moment the effect of the
variables in Z). The implicit assumption is made, then, that the
potential reenlistee values the dollars in WM and in WC in constant
ratio. That is, the potential reenlistee is indifferent to an equal
percentage change in WM and in WC: his reenlistment decision remains
the same whether the relative wage offered him is the ratio WM/WC or
the ratio (1 + a)WM/(1 + a)WC, for any a (a may be positive, negative
or zero, representing an increase, decrease or lack of change
respectively in each of WM and WC). This may not actually reflect the
candidate reenlistee's utility of dollars in WM and WC. The man may in
fact value a percentage increase in his civilian alternative wage WC
more highly (or even less highly) than the same percentage increase in
WM.

[Note on the logit form (b): just as reenlistment rate R can be
considered to be the sample estimate of the probability of reenlisting,
the ratio R/(1-R) may be interpreted as the sample estimate of the odds
of reenlisting.]

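To relieve this possibly erroneous assumption, the following
revisions to models (a) and (b) will be used:

(c)  $R = \alpha' \left(\frac{WM}{WC^{\delta}}\right)^{\beta} Z' \epsilon'$,

(d)  $\frac{R}{1-R} = \alpha' \left(\frac{WM}{WC^{\delta}}\right)^{\beta} Z' \epsilon'$.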


The parameter $\delta$ reflects the possibility that a potential
reenlistee values a percentage change in WM and the same percentage
change in WC differently. Presumably, the value of $\beta$ is positive.
If this is the case, then: if $\delta > 1$ a percentage change in WC is
valued more highly than the same percentage change in WM; if
$\delta = 1$ equations (c) and (d) become (a) and (b); if
$0 < \delta < 1$ a percentage change in WM is valued more highly than
the same percentage change in WC; if $\delta = 0$ the decision to
reenlist is independent of the candidate reenlistee's civilian
alternative wage; a value of $\delta < 0$ indicates an aversion to
civilian dollars. These equations may be rewritten as:

(c')  $\ln R = \alpha + \beta \ln WM + \gamma \ln WC + Z + \epsilon$,

(d')  $\ln\left(\frac{R}{1-R}\right) = \alpha + \beta \ln WM + \gamma \ln WC + Z + \epsilon$,

where: $\gamma = -\beta\delta$.

If $\gamma = -\beta$, then the equations (c') and (d') become (a') and (b').
The coefficient $\beta$ in the equations (c') and (d') is the parameter
of interest. In equation (c'), $\beta$ is the military wage elasticity
of reenlistment rate since application of the partial differential
operator $\partial$ to (c'), while neglecting the disturbance term
$\epsilon$, yields:

     $\partial(\ln R) = \beta\,\partial(\ln WM) + \gamma\,\partial(\ln WC) + \partial Z$;

or

     $\partial R / R = \beta\,(\partial WM / WM) + \gamma\,\partial(\ln WC) + \partial Z$.

Similarly, in equation (d') $\beta$ represents the elasticity of the odds of
reenlistment with respect to military wage.
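
The elasticity interpretation can be checked numerically with a minimal
sketch. It assumes the log-linear form of model (c') with Z and the
disturbance term set to zero; the function name rate_model_c and every
parameter value and dollar figure below are hypothetical, chosen only so
that the computed rate lies between 0 and 1.

```python
import math


def rate_model_c(wm, wc, alpha, beta, gamma):
    """Reenlistment rate under model (c'), with Z and the disturbance omitted."""
    return math.exp(alpha + beta * math.log(wm) + gamma * math.log(wc))


# Hypothetical parameter values (not estimates from the paper).
ALPHA, BETA, DELTA = -6.0, 2.0, 0.75
GAMMA = -BETA * DELTA  # gamma = -beta * delta, as defined in the text

wm, wc = 20000.0, 18000.0  # hypothetical present values, in dollars

r_before = rate_model_c(wm, wc, ALPHA, BETA, GAMMA)
r_after = rate_model_c(1.01 * wm, wc, ALPHA, BETA, GAMMA)  # 1% rise in WM only

# Elasticity measured as (change in ln R) / (change in ln WM), WC held fixed.
elasticity = (math.log(r_after) - math.log(r_before)) / math.log(1.01)
```

Because the model is exactly linear in logarithms, the ratio of log
changes reproduces the coefficient on ln WM for any size of wage change,
not only for small ones.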

It is now appropriate to consider some assumptions about the nature 
of the cross-section and time series data. First, consider reenlistment 
behavior of cohorts of eligible reenlistees over time. It seems 
reasonable to assume that an individual deliberating reenlistment is 
unaffected by the past reenlistment behavior of others, and that his 
decision is also unaffected by past values of relative wage. Stated 
equivalently, this assumption is that the model contains no lagged 
values of reenlistment rate or relative wage. This is a simplifying
assumption; it is of course also possible to postulate and use a 
model which contains lagged values of relative wage. Now consider the 
effect of the war in Vietnam on initial enlistments or of general 
civilian unemployment on reenlistments in the Navy. These are examples 
of temporal factors that can be expected to have a significant effect 
on initial enlistments (in the first case) or reenlistments (in the 
second case) in the Navy. It seems reasonable, then, that a variable 
reflecting such temporal factors should be included in the model. 
Similarly, a potential reenlistee who is a member of a certain Rate and
is in a certain pay grade may be affected by factors peculiar to his
Rate and pay grade, as well as by factors unique to the year in which
the reenlistment decision is made. In particular, since enlisted men
in higher pay grades typically enjoy greater prestige and increased
personal liberty than men in the lower pay grades, it may be hypothesized
that pay grade affects reenlistment rate in ways not expressible in
terms of pecuniary compensation, as well as in its contribution to WM.
It cannot, then, be fairly assumed that factors which depend on Rate, 
pay grade or year of eligibility to reenlist do not separately influ- 
ence the reenlistment decision. As a consequence, variables represent- 
ing the influence of such factors will be included in the model. [Such 
variables are, in general, unobservable or not quantifiable. Their 
inclusion in the model is a formalism for the sake of completeness. ] 
These factors are the general non-pecuniary factors whose existence was 
previously hypothesized. 

Note that nothing has yet been said about the influence of Mental 
Group on the reenlistment decision. It seems likely that personnel in 
different Mental Groups will reenlist at different rates. But
designation of an individual as a member of a particular Mental Group
is a somewhat less accurate, hence less meaningful for statistical
purposes, distinction than classification of personnel by Rate, pay
grade or year of reenlistment. Additionally, WM for a candidate
reenlistee does not
depend on his Mental Group. [An individual's expected WC may, however, 
depend on his Mental Group. If this is the case, it should emerge in 
comparison of results for separate Mental Groups.] Hence, Mental Group 
classification will not be used to define any of the variables of the 
model. Instead, the model to be constructed will be applied to all 
personnel in each of the Mental Groups separately. The results for the 
Mental Groups will then be statistically compared. 

Now consider a potential reenlistee viewing his military and civilian 
pecuniary alternatives. WM depends (in a manner to be made explicit
later) on his Rate and pay grade and on the year in which his current
enlistment expires. But typically the potential reenlistee's view of
his civilian alternatives is limited; he has been effectively isolated
from the civilian world and civilian labor market by the requirements 
of his military service. And, typically, it is likely that he has 
been unable to go job-seeking in the geographical area of interest to 
him for civilian life. So it may be realistic to suppose that the 
alternative civilian wage perceived by the potential reenlistee can be 
considered to be the median wage (or average wage) of the civilian 
population working in his skill category (craftsman, mechanical, elec- 
trical, clerical and so on) in the year in which he is eligible to 
reenlist. This will be taken as a formal assumption: the civilian 
alternative wage perceived by an individual in a given Mental Group 
depends only upon his Rate and the year in which the reenlistment 
decision is made. [This assumption may be faulty in that the alterna- 
tive civilian wage may also depend on the potential reenlistee's 
military pay grade. That is, an advanced rank status in the military 
may promise higher pay in the civilian economy, since it may be 
interpreted as being equivalent to advanced expertise.] 

Since the assumption has been made that variables representing R,
WM and WC are not lagged in the model, the time series data in R, WM
and WC may be considered as another cross-section. Make, for the moment,
the stronger assumption that the model contains no lagged variables at
all. [This assumption is made for the sake of simplicity of
representation. Later it will be seen that the assumption is not
necessary; equivalent results are obtained if it is not made. At the
same time it will be seen that the analogous assumption for the
variables R, WM and WC may be weakened somewhat: identical results will
be obtained even if the model contains lagged values of the variable
WC.] Then the time series, represented by year in which observations
are made, may be considered as another cross-section. Let the
subscripts i, j and t represent Rate, pay grade and year of reenlistment
eligibility. Then the equations (c') and (d') can be represented in
cross-section data as


(e)  $\ln R_{ijt} = \alpha + \beta \ln WM_{ijt} + \gamma \ln WC_{it} + A_i + B_j + C_t + \epsilon_{ijt}$,

(f)  $\ln\left(\frac{R_{ijt}}{1 - R_{ijt}}\right) = \alpha + \beta \ln WM_{ijt} + \gamma \ln WC_{it} + A_i + B_j + C_t + \epsilon_{ijt}$,

where:

$R_{ijt}$ is observed reenlistment rate for Rate i, pay grade j, year t;

$WM_{ijt}$ is military wage for Rate i, pay grade j, year t;

$WC_{it}$ is alternative civilian wage for Rate i in year t;

the variables $A_i$, $B_j$ and $C_t$ represent all factors which influence
reenlistment in, respectively, Rate i, pay grade j, or year t uniquely;

$\epsilon_{ijt}$ is the disturbance term for the observation of $R_{ijt}$.

$A_i$, $B_j$ and $C_t$ are the variables whose introduction into the
model was promised earlier. Note that these variables are invariant over
subscripts not included in their notational expression. For example, the
factors represented by $C_t$ depend only on the year of reenlistment, and
are invariant over Rate and pay grade.

Note that a crucial assumption implicit in equations (e) and (f)
is that the variables $R_{ijt}$ and $WM_{ijt}$ are the only variables in the
model which are not invariant over at least one cross-sectional
dimension (for convenience, the set of all Rates considered in the 
analysis will be referred to as a cross-sectional "dimension"; similarly 
for the set of all years and the set of all pay grades considered). 
Later work relies heavily on this assumption. 

The models represented by equations (e) and (f) seem reasonably
complete with the introduction of the variables $A_i$, $B_j$ and $C_t$ as
"catch-all" categories to reflect all factors which influence reenlistment
depending on Rate, pay grade and year separately. But it is clear that
the inclusion of these variables creates a problem: quantification of
$A_i$, $B_j$ and $C_t$ is difficult if not impossible. Note that this
problem is indissoluble. The influence of such variables as $C_t$ and
$WC_{it}$ on the decision of a potential reenlistee is almost certainly
non-trivial. Their effects cannot reasonably be ignored in any rational
model of first-term reenlistment behavior. One possible approach to
resolving this problem is to construct a model using dummy variables to
represent Rate, pay grade and year. But in the face of 61 Rates, nine
pay grades and seven years this may yield results too minutely
specialized to be interesting unless a certain amount of arbitrary
aggregation (over Rates, pay grades and years) is done. In any case, an
alternative procedure for ridding the models (e) and (f) of the effects
of the variables $A_i$, $B_j$ and $C_t$ will be used here. Use of this
procedure is also motivated by a desire to rid the model of the variable
WC, the civilian alternative wage, the method of measurement of which
may be subject to dispute.

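To specify the procedure, consider:

(e)  $\ln R_{ijt} = \alpha + \beta \ln WM_{ijt} + \gamma \ln WC_{it} + A_i + B_j + C_t + \epsilon_{ijt}$

in "observed" data. In what follows a dot in place of a subscript
denotes the mean over that subscript; for R, WM and WC the mean is
taken over the logarithms, since it is equation (e) that is averaged.

Taking the mean, for Rate i and pay grade j, over all years:

(e1)  $\ln R_{ij\cdot} = \alpha + \beta \ln WM_{ij\cdot} + \gamma \ln WC_{i\cdot} + A_i + B_j + C_{\cdot} + \epsilon_{ij\cdot}$

where, for example,

     $\ln R_{ij\cdot} = \frac{1}{T} \sum_{t=1}^{T} \ln R_{ijt}$

and

     $\ln WC_{i\cdot} = \frac{1}{T} \sum_{t=1}^{T} \ln WC_{it}$,

for T = number of years considered in the data.

Taking the mean, for Rate i in year t, over all pay grades:

(e2)  $\ln R_{i\cdot t} = \alpha + \beta \ln WM_{i\cdot t} + \gamma \ln WC_{it} + A_i + B_{\cdot} + C_t + \epsilon_{i\cdot t}$

Taking the mean, for pay grade j in year t, over all Rates:

(e3)  $\ln R_{\cdot jt} = \alpha + \beta \ln WM_{\cdot jt} + \gamma \ln WC_{\cdot t} + A_{\cdot} + B_j + C_t + \epsilon_{\cdot jt}$

Taking the mean, for year t, over all Rates and pay grades:

(e4)  $\ln R_{\cdot\cdot t} = \alpha + \beta \ln WM_{\cdot\cdot t} + \gamma \ln WC_{\cdot t} + A_{\cdot} + B_{\cdot} + C_t + \epsilon_{\cdot\cdot t}$

Taking the mean, for pay grade j, over all Rates and years:

(e5)  $\ln R_{\cdot j\cdot} = \alpha + \beta \ln WM_{\cdot j\cdot} + \gamma \ln WC_{\cdot\cdot} + A_{\cdot} + B_j + C_{\cdot} + \epsilon_{\cdot j\cdot}$

Taking the mean, for Rate i, over all pay grades and years:

(e6)  $\ln R_{i\cdot\cdot} = \alpha + \beta \ln WM_{i\cdot\cdot} + \gamma \ln WC_{i\cdot} + A_i + B_{\cdot} + C_{\cdot} + \epsilon_{i\cdot\cdot}$

Taking the grand mean:

(e7)  $\ln R_{\cdot\cdot\cdot} = \alpha + \beta \ln WM_{\cdot\cdot\cdot} + \gamma \ln WC_{\cdot\cdot} + A_{\cdot} + B_{\cdot} + C_{\cdot} + \epsilon_{\cdot\cdot\cdot}$

Adding and subtracting,

     (e) - (e1) - (e2) - (e3) + (e4) + (e5) + (e6) - (e7)

yields the equation:

     $\ln R_{ijt} - \ln R_{ij\cdot} - \ln R_{i\cdot t} - \ln R_{\cdot jt} + \ln R_{\cdot\cdot t} + \ln R_{\cdot j\cdot} + \ln R_{i\cdot\cdot} - \ln R_{\cdot\cdot\cdot}$

     $= \beta\,(\ln WM_{ijt} - \ln WM_{ij\cdot} - \ln WM_{i\cdot t} - \ln WM_{\cdot jt} + \ln WM_{\cdot\cdot t} + \ln WM_{\cdot j\cdot} + \ln WM_{i\cdot\cdot} - \ln WM_{\cdot\cdot\cdot})$

     $+\ \epsilon_{ijt} - \epsilon_{ij\cdot} - \epsilon_{i\cdot t} - \epsilon_{\cdot jt} + \epsilon_{\cdot\cdot t} + \epsilon_{\cdot j\cdot} + \epsilon_{i\cdot\cdot} - \epsilon_{\cdot\cdot\cdot}$.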

A similar result holds for the model represented by equation (f). 
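
The adding-and-subtracting step can be sketched in code. The sketch
below is illustrative only and rests on assumptions not made in the
text: a balanced panel (every Rate, pay grade and year cell observed),
synthetic data with hypothetical parameter values, and a disturbance
term set to zero so that the slope is recovered exactly. The names
normalize, flatten, ln_wm, ln_wc and ln_r are hypothetical.

```python
import random


def normalize(x):
    """Subtract the three two-way means from x[i][j][t], add back the three
    one-way means, and subtract the grand mean."""
    n_i, n_j, n_t = len(x), len(x[0]), len(x[0][0])
    m_ij = [[sum(x[i][j]) / n_t for j in range(n_j)] for i in range(n_i)]
    m_it = [[sum(x[i][j][t] for j in range(n_j)) / n_j for t in range(n_t)]
            for i in range(n_i)]
    m_jt = [[sum(x[i][j][t] for i in range(n_i)) / n_i for t in range(n_t)]
            for j in range(n_j)]
    m_i = [sum(m_ij[i]) / n_j for i in range(n_i)]
    m_j = [sum(m_ij[i][j] for i in range(n_i)) / n_i for j in range(n_j)]
    m_t = [sum(m_it[i][t] for i in range(n_i)) / n_i for t in range(n_t)]
    m = sum(m_i) / n_i
    return [[[x[i][j][t] - m_ij[i][j] - m_it[i][t] - m_jt[j][t]
              + m_t[t] + m_j[j] + m_i[i] - m
              for t in range(n_t)] for j in range(n_j)] for i in range(n_i)]


def flatten(x):
    """List all cells of a three-dimensional nested list."""
    return [v for plane in x for row in plane for v in row]


random.seed(0)
N_I, N_J, N_T = 4, 3, 5                      # Rates, pay grades, years
ALPHA, BETA, GAMMA = -6.0, 2.0, -1.5         # hypothetical parameters

# Unobservable effects and civilian wage: fewer than three subscripts each.
A = [random.gauss(0, 1) for _ in range(N_I)]
B = [random.gauss(0, 1) for _ in range(N_J)]
C = [random.gauss(0, 1) for _ in range(N_T)]
ln_wc = [[random.uniform(9.5, 10.0) for _ in range(N_T)] for _ in range(N_I)]

# Military wage varies over all three subscripts.
ln_wm = [[[random.uniform(9.5, 10.5) for _ in range(N_T)]
          for _ in range(N_J)] for _ in range(N_I)]

# Equation (e) with the disturbance term set to zero.
ln_r = [[[ALPHA + BETA * ln_wm[i][j][t] + GAMMA * ln_wc[i][t]
          + A[i] + B[j] + C[t]
          for t in range(N_T)] for j in range(N_J)] for i in range(N_I)]

y = flatten(normalize(ln_r))
x = flatten(normalize(ln_wm))

# Normalized data have zero mean, so the regression passes through the origin.
beta_hat = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# A two-subscript variable such as ln WC_it vanishes under normalization.
wc_3d = [[[ln_wc[i][t] for t in range(N_T)] for _ in range(N_J)]
         for i in range(N_I)]
max_wc_residual = max(abs(v) for v in flatten(normalize(wc_3d)))
```

In this synthetic setting beta_hat agrees with BETA up to rounding error
even though A, B, C and ln_wc are never supplied to the regression, and
max_wc_residual is zero up to rounding error.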

This is the form of the data that will be used in a linear regression
to estimate the coefficient $\beta$. For want of more convenient
terminology, data in the form above will often be referred to as
"normalized data", while the initial values of each $\ln R_{ijt}$ and
$\ln WM_{ijt}$ will be called the "original data." In addition, the
procedure of obtaining normalized data from the original data will
sometimes be called "the model" when no ambiguity is possible. Some
features of "the model" in this sense are investigated in Section IV.

Now note that any variable which has fewer than three subscripts in 
its notational expression disappears from the normalized form of the 
data. A little reflection shows that lagged values of any such vari- 
able are also purged in the normalized data. In particular this holds 
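for the variable $WC_{it}$. As a consequence, it is only necessary, in
order to obtain the identical equation in normalized data, to assure
that the model contains no lagged values of $R_{ijt}$ and $WM_{ijt}$.

The question of the nature of the normalized disturbance term:

     $\epsilon_{ijt} - \epsilon_{ij\cdot} - \epsilon_{i\cdot t} - \epsilon_{\cdot jt} + \epsilon_{\cdot\cdot t} + \epsilon_{\cdot j\cdot} + \epsilon_{i\cdot\cdot} - \epsilon_{\cdot\cdot\cdot}$

will be taken up later.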


D. THE CONSTRUCTION OF WM 

The measurement of WM used here is that proposed by Burton C. Gray 
in [13]. 

As mentioned previously, pecuniary compensation for reenlisting can
be viewed as consisting of two types of remuneration: the actual wage
received by the reenlistee and the value placed by the reenlistee on
the peripheral benefits of military service. A component of the actual
wage received by a reenlistee that is unique to first-term reenlistments
is the Variable Reenlistment Bonus (VRB). This bonus is a multiple
of the reenlistee's annual base pay (which in turn depends upon pay
grade) and varies from year to year and from Rate to Rate (depending
on the valuation placed on reenlistments in a given Rate in a given year).
Since fiscal year 1965 VRB has been the primary tool used to influence
reenlistments selectively (by Rate). Prior to FY 1965 all reenlistees
received a reenlistment bonus that was a fixed multiple of annual base
pay. Ideally, one would wish to evaluate the effect of VRB on first-term
reenlistment behavior. This is not done here, since the estimation of a
single parameter of interest is intended only to illustrate the
fundamental goal of this paper, an investigation of the consequences
of using normalized data. VRB enters the construction of WM as merely
another component.

Now consider the future of a reenlistee. He can reasonably expect 
promotion to a higher pay grade within his next term of enlistment, with 
a concurrent increase in pay. This expectation obviously influences the 
reenlistment decision (for it can be supposed that fewer men would 
reenlist without the promise of probable advancement in rank), but in 
a way difficult to specify. The simplifying assumption is made that 
this promise of increased future pay offsets the lesser valuation of 
future dollars. That is, in considering the present value of WM, the 
potential reenlistee employs a discount rate of zero. 

A final assumption, due to the nature of the available data base, 
is made. For want of other information, it is assumed that all 
reenlistments are made for an obligation of four years. 

With the preceding paragraphs in mind, it is possible to postulate
the following construction:

    WM = 4C + P [VRB + 4(1 + K)] ,

where, for a potential reenlistee, WM is the present value of military
wage for a four-year reenlistment (at a zero discount rate), P is the
reenlistee's annual base pay, VRB is the appropriate Variable Reenlistment
Bonus multiple,

C is a constant representing the monetary valuation of the
peripheral benefits of military service for a four-year
reenlistment,

K is a dimensionless multiplicative constant representing the
valuation of those benefits associated with military
service that can be expected to increase with annual base
pay. K is intended to reflect such elements as tax
advantages, allowances and commissary and exchange benefits,
whose value increases as base pay increases.

This may be rewritten, for Rate i, pay grade j and year t, as:

    WM_{ijt} = 4C + P_{jt} [VRB_{it} + 4(1 + K)] .

The construction of WM allows freedom for parameterization of the
constants C and K. In order to get an idea of the sensitivity of the
coefficient β to changes in assumed C and K, regression analyses are
performed for various presumably reasonable values of these constants.

III. APPLICATION


A. PRELIMINARY

Consider the consequences of applying the natural logarithm
transformation to the variables R_{ijt} and R_{ijt}/(1 - R_{ijt}). These
variables have respective ranges of values of [0,1] and [0,∞), which
under the natural logarithm transformation become (-∞,0] and (-∞,∞).
Thus this transformation avoids the awkward situation of having a
finite range of values on the dependent variable (in the case of
R_{ijt}) in a linear regression analysis. But there is a limitation
associated with the use of the logarithmic transformation: the
logarithm of a reenlistment rate of zero is undefined. Hence in the
model represented by equation (e) of the preceding section, no
observations of zero reenlistment rate can be allowed. Additionally, in
the model represented by equation (f), a reenlistment rate equal to one
must be disallowed, since this corresponds to an infinitely large value
of the odds of reenlistment. Accordingly, since it is desirable to use
the same data base for each of the models (e) and (f), any observations
of reenlistment rate equal to zero or one will be discarded. This is
not felt to restrict the analysis too severely, since reenlistment
rates of zero or one, the extreme values of the data, typically
correspond to extraordinary classes of reenlistees. In particular,
reenlistment rates of zero are most common in very low pay grades and
reenlistment rates of one are usually observed in the highest pay
grades. This suggests that a zero reenlistment rate can usually be
associated with a class of men who show an unsuitability for military
service, while a reenlistment rate equal to one can usually be
associated with the class of men who thrive in the military. Neither of
these classes is particularly interesting for a study of general
reenlistment behavior.
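
The screening rule just described can be sketched in a few lines of
Python. This is a minimal illustration under assumptions of our own:
the function name log_transforms and the sample rates are hypothetical
and are not taken from the data base used in this study.

```python
import math


def log_transforms(rate):
    """Return (ln R, ln(R / (1 - R))), or None when R is 0 or 1.

    A rate of 0 has no logarithm, and a rate of 1 gives infinite odds,
    so both kinds of observation are discarded for models (e) and (f).
    """
    if rate <= 0.0 or rate >= 1.0:
        return None
    return math.log(rate), math.log(rate / (1.0 - rate))


rates = [0.0, 0.25, 0.5, 1.0]  # hypothetical reenlistment rates
kept = [(r, log_transforms(r)) for r in rates if log_transforms(r) is not None]
```

Only the two interior rates survive, so both models are fitted to the
same set of observations.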

Now suppose that in models (e) and (f) the error terms ε_{ijt} are
independent, identically distributed Normal random variables, each with
mean zero and variance σ^2. Then the application of ordinary least
squares procedures to estimate the coefficient β in the normalized form
of model (e),

    \ln R_{ijt} - \ln R_{ij.} - \ln R_{i.t} - \ln R_{.jt} + \ln R_{i..} +
    \ln R_{.j.} + \ln R_{..t} - \ln R_{...} =

    β(\ln WM_{ijt} - \ln WM_{ij.} - \ln WM_{i.t} - \ln WM_{.jt} + \ln WM_{i..} +
    \ln WM_{.j.} + \ln WM_{..t} - \ln WM_{...}) +

    ε_{ijt} - ε_{ij.} - ε_{i.t} - ε_{.jt} + ε_{i..} + ε_{.j.} + ε_{..t} - ε_{...} ,

yields an unbiased estimator for this coefficient. The same is true for
ordinary least squares estimation of β in the normalized form of model
(f). These assertions will be proved in Section IV, where it will also
be shown that the above assumption about the distribution of the
disturbance terms ε_{ijt} may be relaxed somewhat.

B. VALUES FOR PARAMETERIZED C AND K 
Regression analyses were performed for each combination of the
following selected values of the constants C and K:

       C         K
      500
     1000      0.10
     1500      0.15
     2000      0.20

It is felt that these selected values represent a range broad enough to
include realistic possible values of the constants.


C. THE REGRESSION ANALYSES 
In addition to estimating the coefficient β in the normalized forms
of the models (e) and (f), it may be interesting (for comparative
purposes) to estimate β in the equations:

    (g)  \ln R_{ijt} = α + β \ln WM_{ijt} + ε_{ijt}

    (h)  \ln [ R_{ijt} / (1 - R_{ijt}) ] = α + β \ln WM_{ijt} + ε_{ijt}

where it is assumed that the ε_{ijt} are independent, identically
distributed Normal random variables with mean zero and variance σ^2.
Note that these latter equations are truncated forms of the models
(e) and (f): the variables WC, A, B and C (each of which carries fewer
than three subscripts) are neglected.

Four selections for the value of the constant C and three choices
for the constant K yield 12 different constructions of WM. Regression
analyses are conducted for each of these constructions of WM, using
models (e) (normalized), (f) (normalized), (g) and (h) for each of five
Mental Groups. This produces 240 least squares estimations to be
considered. Results for one construction of WM for models (e) (normalized),
(f) (normalized), (g) and (h) and each of the five Mental Group
classifications are examined in detail in this section. Less detailed
regression analysis results for the remaining 11 constructions of WM
are given in Appendix A in tabular form.
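
The count of 240 estimations follows directly from the design and can
be enumerated as below. The sketch only lists the combinations; the
variable names are hypothetical labels, and no regression is run.

```python
from itertools import product

C_VALUES = [500, 1000, 1500, 2000]          # selected values of C
K_VALUES = [0.10, 0.15, 0.20]               # selected values of K
MODELS = ["(e) normalized", "(f) normalized", "(g)", "(h)"]
MENTAL_GROUPS = ["I", "II", "upper III", "lower III", "IV"]

# Each (C, K) pair defines one construction of WM.
constructions = list(product(C_VALUES, K_VALUES))
# Each construction is fitted with every model for every Mental Group.
runs = list(product(constructions, MODELS, MENTAL_GROUPS))
```

Here len(constructions) is 12 and len(runs) is 240, matching the counts
given in the text.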

Now consider Table I, which gives summary results for the construction
of WM using C = 500 and K = 0.10. Denote Mental Groups I, II,
upper III, lower III and IV as Mental Groups 1, 2, 3, 4 and 5 respectively.

                                   Table I

                          B            SE         t            \hat{σ}^2    R            N

Normalized Model (e)
    MG 1                  1.17260      0.26011     4.49983     [illegible]  0.1904        720
    MG 2                  1.76626      0.17863     9.90073     0.15014      0.3070       1259
    MG 3                  1.84425      0.21828     8.44902     0.17024      0.2950        996
    MG 4                  1.34492      0.20119     6.68474     0.15299      [illegible]   805
    MG 5                  1.50907      0.28158     5.35937     [illegible]  0.2601        530

Normalized Model (f)
    MG 1                  1.87660      0.36445     5.14912     0.38389      [illegible]   720
    MG 2                  [illegible]  0.24978    10.89793     0.29433      0.3346       1259
    MG 3                  2.61042      0.30134     8.66258     0.32445      0.3025        996
    MG 4                  2.00364      0.28072     7.13740     0.29784      0.2798        805
    MG 5                  2.16256      0.39745     5.44106     [illegible]  0.2638        530

Model (g)
    MG 1                  1.36861      0.12644    10.82445     0.59642      0.3746        720
    MG 2                  1.91656      0.09547    20.07401     0.65793      0.4927       1259
    MG 3                  1.58111      0.11230    14.07977     0.62178      0.4078        996
    MG 4                  1.44961      0.12798    11.3266      0.62849      [illegible]   805
    MG 5                  1.54090      0.14984    10.28386     0.46204      0.4085        530

Model (h)
    MG 1                  1.85354      0.17451    10.62108     1.13624      0.3685        720
    MG 2                  2.70828      [illegible] 19.91696    1.33460      0.4898       1259
    MG 3                  2.05608      [illegible] 13.44309    [illegible]  [illegible]   996
    MG 4                  1.93526      0.17588    11.00301     1.18701      0.3620        805
    MG 5                  2.11862      0.21826     9.70676     [illegible]  [illegible]   530

Let B denote the estimate for β, SE represent the standard error of the
estimate of β, t represent the computed t-statistic, \hat{σ}^2 be the estimate
of the variance σ^2, R be the multiple correlation coefficient and N
represent the number of observations of R_{ijt}. [It will be shown in
Section IV that \hat{σ}^2 is an unbiased estimator for σ^2.] Note that the
computed values of the t-statistic indicate that in each of the twenty
least squares estimations of β represented in Table I the estimated
coefficient is significantly different from zero. But also note that in
comparing results for the normalized models (e) and (f) and the
corresponding truncated non-normalized models (g) and (h), the following
differences are consistently true for each Mental Group:

    1.  The values of computed t-statistic for models (g) and (h)
        are greater than the values for models (e) and (f).

    2.  The standard error of the estimate is less for models (g)
        and (h) than for models (e) and (f).

    3.  The multiple correlation coefficient R is greater for
        models (g) and (h) than for models (e) and (f).


These considerations might seem to indicate that models (g) and (h)
fit the data better than the corresponding normalized forms of models
(e) and (f). But in reality the results 1., 2., and 3. are not
particularly surprising, since the computed value of t is directly
proportional to, and the computed value of SE inversely proportional to,
the square root of the sum of squared deviations from the mean of the
explanatory variable, while 1 - R^2 is inversely proportional to the sum
of squared deviations from the mean of the dependent variable. That is,
for a single explanatory variable with observed values x_i, i = 1, ..., n,
and a dependent variable with observed values y_i, i = 1, ..., n:

    SE = [ \hat{σ}^2 / \sum_{i=1}^{n} (x_i - \bar{x})^2 ]^{1/2} ,

    t = B / SE ,

and:

    R^2 = 1 - \sum_{i=1}^{n} (y_i - A - B x_i)^2 / \sum_{i=1}^{n} (y_i - \bar{y})^2 ,

where:

    \bar{x} = (1/n) \sum_{i=1}^{n} x_i ,    \bar{y} = (1/n) \sum_{i=1}^{n} y_i ,

A is the estimated intercept, B is the estimated regression coefficient,
and \hat{σ}^2 is the estimate of σ^2. Hence as the sums of squared
deviations from the mean of both the explanatory variable and the
dependent variable decrease, it is to be anticipated that SE will
increase and that R^2 and the computed t-statistic will decrease. To see
how this fact yields the results in comparisons 1., 2., and 3. above,
consider the explanatory and dependent variables of the models (e)
(normalized) and (g). Dropping for a moment the logarithm symbol, model
(e) (normalized) has dependent variable:

    R_{ijt} - R_{ij.} - R_{i.t} - R_{.jt} + R_{i..} + R_{.j.} + R_{..t} - R_{...}

and explanatory variable:

    WM_{ijt} - WM_{ij.} - WM_{i.t} - WM_{.jt} + WM_{i..} + WM_{.j.} + WM_{..t} - WM_{...}

both of which have mean zero, while model (g) has dependent variable
R_{ijt} and explanatory variable WM_{ijt}. Taking squared deviations from
the mean for the variable R_{ijt}:

    \sum_{ijt} (R_{ijt} - R_{...})^2 =

        T \sum_{ij} (R_{ij.} - R_{i..} - R_{.j.} + R_{...})^2 + JT \sum_{i} (R_{i..} - R_{...})^2 +
        J \sum_{it} (R_{i.t} - R_{i..} - R_{..t} + R_{...})^2 +
        I \sum_{jt} (R_{.jt} - R_{.j.} - R_{..t} + R_{...})^2 + IJ \sum_{t} (R_{..t} - R_{...})^2 +
        \sum_{ijt} (R_{ijt} - R_{ij.} - R_{i.t} - R_{.jt} + R_{i..} + R_{.j.} + R_{..t} - R_{...})^2 +
        IT \sum_{j} (R_{.j.} - R_{...})^2

        ≥ \sum_{ijt} (R_{ijt} - R_{ij.} - R_{i.t} - R_{.jt} + R_{i..} + R_{.j.} + R_{..t} - R_{...})^2 ,

since all terms in the above equation are non-negative. But the term
on the right hand side of this inequality is the sum of squared
deviations from the mean of the dependent variable in the normalized form
of model (e). A similar result holds in the comparison of the sum of
squared deviations from the mean of the explanatory variables in
models (e) (normalized) and (g). And a similar result holds in the
comparison of the models (f) (normalized) and (h) as well. As a
consequence, the results of comparisons 1., 2., and 3. are not unexpected.
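
The inequality used above rests on the usual decomposition of a
three-way sum of squares into main-effect, two-way and three-way
components. The following Python sketch checks that decomposition
numerically for a complete array. It is an illustration only: the array
r holds hypothetical random values rather than reenlistment data, and
all variable names are our own.

```python
import random
from itertools import product

random.seed(0)
I, J, T = 3, 4, 5  # hypothetical numbers of categories
cells = list(product(range(I), range(J), range(T)))
r = [[[random.random() for _ in range(T)] for _ in range(J)] for _ in range(I)]


def avg(values):
    values = list(values)
    return sum(values) / len(values)


g = avg(r[i][j][t] for i, j, t in cells)                                     # R_...
a = [avg(r[i][j][t] for j, t in product(range(J), range(T))) for i in range(I)]  # R_i..
b = [avg(r[i][j][t] for i, t in product(range(I), range(T))) for j in range(J)]  # R_.j.
c = [avg(r[i][j][t] for i, j in product(range(I), range(J))) for t in range(T)]  # R_..t
ab = [[avg(r[i][j]) for j in range(J)] for i in range(I)]                        # R_ij.
ac = [[avg(r[i][j][t] for j in range(J)) for t in range(T)] for i in range(I)]   # R_i.t
bc = [[avg(r[i][j][t] for i in range(I)) for t in range(T)] for j in range(J)]   # R_.jt

# Sum of squared deviations from the mean in the original data.
total = sum((r[i][j][t] - g) ** 2 for i, j, t in cells)
main = (J * T * sum((a[i] - g) ** 2 for i in range(I))
        + I * T * sum((b[j] - g) ** 2 for j in range(J))
        + I * J * sum((c[t] - g) ** 2 for t in range(T)))
two_way = (T * sum((ab[i][j] - a[i] - b[j] + g) ** 2
                   for i, j in product(range(I), range(J)))
           + J * sum((ac[i][t] - a[i] - c[t] + g) ** 2
                     for i, t in product(range(I), range(T)))
           + I * sum((bc[j][t] - b[j] - c[t] + g) ** 2
                     for j, t in product(range(J), range(T))))
# Sum of squares of the normalized data (the three-way component).
three_way = sum((r[i][j][t] - ab[i][j] - ac[i][t] - bc[j][t]
                 + a[i] + b[j] + c[t] - g) ** 2 for i, j, t in cells)
```

The components add up to total, so three_way can never exceed it; this
is why the normalized variables show smaller sums of squares than the
original ones.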
Now consider the estimates of β presented in Table I. All estimates
of the military wage elasticity of the odds of reenlistment and the
probability of reenlistment exceed one. In fact, the estimates of the
elasticity of R with respect to WM cluster loosely about a value of 1.5,
while the estimates of the elasticity of R/(1 - R) with respect to WM
have a median value of approximately 2. Since these estimates are based
on a single choice for the construction of WM no great import will be
assigned to them, except to note that they are not appreciably different
from estimates of these quantities obtained in other studies. For
example, estimates of the WM elasticity of R in previous studies are
generally confined to the range 0.8 to 3, with the bulk of the estimates
lying between 1 and 2. Note that in the normalized forms of models (e)
and (f) the estimates B for Mental Groups II and upper III seem to be
appreciably higher than estimates of this coefficient for Mental Groups
I, lower III and IV. (This apparent difference is not so marked for
models (g) and (h); in any case models (g) and (h) are of interest here
only for a comparison of results with the corresponding normalized forms
of models (e) and (f), so that the former models will not be treated
further.) This result agrees very well with prior expectations: it
indicates that personnel in the highest and lowest Mental Groups are
less inclined toward reenlistment than men in the middle Mental Groups.
It can be argued that this result is reasonable since men in Mental
Group I, who presumably possess greater intellectual ability, may find
greater rewards and challenges in civilian life than in enlisted
military service, while men in Mental Groups lower III and IV may often
find themselves unable to compete for advancement successfully with men
in higher Mental Groups, and may sometimes be unable to meet demands of
competence placed on them by military service. For both the highest and
lowest Mental Groups, then, enlisted military service may be viewed as
limited in opportunity. To establish the validity of these initial
observations it is desirable to determine if the estimates B contained
in Table I do in fact estimate different coefficients β for different
Mental Groups (that is, whether the same coefficient β applies for all
Mental Groups or whether different

coefficients β_i apply for different Mental Groups).

Toward this end a statistical test, in which the estimates B may
be compared for each pair of Mental Groups in each of the models (e)
(normalized) and (f) (normalized), is in order. Concentrate now on
the normalized form of model (e). For the regression analysis of
Mental Group i, i = 1, ..., 5, let \hat{σ}_i^2 be the estimate of σ^2, B_i be
the estimate of β_i, and n_i be the number of observations. Since the
estimated intercept for each least squares estimation using the
normalized form of model (e) is zero, testing for the equality of the
coefficients β_i is equivalent to testing for the equality of the
appropriate regression lines. Now if Mental Groups i and j yield the
same regression line in the normalized form of model (e), then
\hat{σ}_i^2 and \hat{σ}_j^2 both estimate the same variance σ^2. And in
this case,

    [ \frac{(I-1)(J-1)(T-1)}{IJT} n_i - 1 ] \frac{\hat{σ}_i^2}{σ^2}
        is distributed as χ^2 with \frac{(I-1)(J-1)(T-1)}{IJT} n_i - 1 degrees of freedom,

and

    [ \frac{(I-1)(J-1)(T-1)}{IJT} n_j - 1 ] \frac{\hat{σ}_j^2}{σ^2}
        is distributed as χ^2 with \frac{(I-1)(J-1)(T-1)}{IJT} n_j - 1 degrees of freedom,

where these two Chi-squared random variables are independent since they
are derived from two different (and assumed independent) populations of
random variables. [See Section IV for the development of this
assertion. Here I = 61 is the number of Rates, J = 9 is the number of pay
grades and T = 7 is the number of years considered.]

Hence, as the sum of two independent χ^2 random variables, the
quantity:

    \frac{1}{σ^2} { [ \frac{(I-1)(J-1)(T-1)}{IJT} n_i - 1 ] \hat{σ}_i^2
                  + [ \frac{(I-1)(J-1)(T-1)}{IJT} n_j - 1 ] \hat{σ}_j^2 }

has a χ^2 distribution with:

    \frac{(I-1)(J-1)(T-1)}{IJT} (n_i + n_j) - 2

degrees of freedom. Now if Mental Groups i and j yield the same
regression line then β_i - β_j = 0, in which case B_i - B_j is Normally
distributed with mean zero (since B_i and B_j are unbiased estimators of
β_i = β_j) and variance:

    Var(B_i - B_j) = Var(B_i) + Var(B_j)
                   = \frac{σ^2}{\sum_{k=1}^{n_i} (x_k^i - \bar{x}^i)^2}
                   + \frac{σ^2}{\sum_{k=1}^{n_j} (x_k^j - \bar{x}^j)^2} ,

where for convenience x_k^m represents the k-th observation on the
explanatory variable for the normalized form of model (e), applied to
Mental Group m = i, j. Hence:

    (B_i - B_j) / { σ [ \frac{1}{\sum_{k=1}^{n_i} (x_k^i - \bar{x}^i)^2}
                      + \frac{1}{\sum_{k=1}^{n_j} (x_k^j - \bar{x}^j)^2} ]^{1/2} }
        is distributed as N(0, 1).

As a consequence, under the composite hypothesis that \hat{σ}_i^2 and
\hat{σ}_j^2 estimate the same parameter σ^2 and that β_i = β_j, the
quantity:

    (B_i - B_j)
    [ \frac{1}{\sum_{k=1}^{n_i} (x_k^i - \bar{x}^i)^2} + \frac{1}{\sum_{k=1}^{n_j} (x_k^j - \bar{x}^j)^2} ]^{-1/2}
    [ \frac{ [ \frac{(I-1)(J-1)(T-1)}{IJT} n_i - 1 ] \hat{σ}_i^2 + [ \frac{(I-1)(J-1)(T-1)}{IJT} n_j - 1 ] \hat{σ}_j^2 }
           { \frac{(I-1)(J-1)(T-1)}{IJT} (n_i + n_j) - 2 } ]^{-1/2}

has a t-distribution with:

    \frac{(I-1)(J-1)(T-1)}{IJT} (n_i + n_j) - 2

degrees of freedom. Computing this statistic, for the normalized forms
of models (e) and (f) separately, for each pair of Mental Groups I, II,
upper III, lower III and IV yields the results given in Table II.
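
The pairwise comparison can be sketched in Python as follows. This is
an illustration under stated assumptions, not the computation actually
used for Table II: the function names pair_df and pair_t and their
argument names are hypothetical, the pooled-variance form is our reading
of the derivation above, and I = 61 is read from a partly illegible
figure in the text (it does reproduce the degrees of freedom reported in
Table II for sample sizes 720 and 1259).

```python
import math

I, J, T = 61, 9, 7  # Rates, pay grades, years (I = 61 is an assumed reading)
RATIO = (I - 1) * (J - 1) * (T - 1) / (I * J * T)


def pair_df(n_i, n_j):
    """Degrees of freedom for comparing two Mental Groups."""
    return RATIO * (n_i + n_j) - 2


def pair_t(b_i, b_j, s2_i, s2_j, ssx_i, ssx_j, n_i, n_j):
    """t-statistic for the hypothesis beta_i = beta_j with a common variance.

    s2_* are the variance estimates, ssx_* the sums of squared deviations
    of the normalized explanatory variable, n_* the numbers of observations.
    """
    pooled = ((RATIO * n_i - 1) * s2_i + (RATIO * n_j - 1) * s2_j) / pair_df(n_i, n_j)
    return (b_i - b_j) / math.sqrt(pooled * (1.0 / ssx_i + 1.0 / ssx_j))


df_12 = round(pair_df(720, 1259))  # Mental Groups 1 and 2
t_example = pair_t(2.0, 1.0, 1.0, 1.0, 4.0, 4.0, 100, 100)  # made-up inputs
```

With equal variance estimates the pooled variance reduces to that common
value, so the made-up example gives a statistic of the square root of 2.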

Note that at a very high level of significance, none of the
coefficients B_i, B_j (for either model (e) or (f)) test significantly
different from each other, so that at a high chosen level of
significance the composite null hypothesis that \hat{σ}_i^2 and
\hat{σ}_j^2 both estimate a common σ^2 and that β_i = β_j cannot be
rejected. But note that the magnitudes of the computed t-statistics for
the most part give credence (especially in the normalized form of model
(f)) to the observations that prompted this test: the sets (β_2, β_3)
and (β_1, β_4, β_5) of coefficients may be accepted as being different
from each other, and the coefficients within each of these sets may be
accepted as being the same, at an appreciably higher level of
significance than any other partition of the set
(β_1, β_2, β_3, β_4, β_5).

                          TABLE II

    (i,j)      t(R)           t(R/(1-R))       df

    1,2        [illegible]    [illegible]      1481
    1,3        [illegible]    [illegible]      1284
    1,4        [illegible]    [illegible]      1141
    1,5        [illegible]    [illegible]       935
    2,3        [illegible]    [illegible]      1688
    2,4        [illegible]    [illegible]      1545
    2,5        [illegible]    [illegible]      1339
    3,4        [illegible]    [illegible]      1348
    3,5        [illegible]    [illegible]      1142
    4,5        [illegible]    [illegible]       999

(i,j) refers to the comparison of coefficients for Mental Groups i and j.

t(R) is the computed t-statistic for the normalized form of model (e).

t(R/(1-R)) is the computed t-statistic for the normalized form of model (f).

df is the appropriate degrees of freedom,

    \frac{(I-1)(J-1)(T-1)}{IJT} (n_i + n_j) - 2 ,

of the t-distribution to the nearest integer.

IV. FEATURES OF THE MODEL


A. A MORE GENERAL CROSS-SECTIONAL MODEL 

Consider a slightly more general form of the reenlistment model.
For simplicity in the derivation of results, suppose that three
cross-sectional dimensions are involved. Let Y = Xβ + ZΩ + ε, where
Y is an n-vector of observations on the dependent variable, X is an n x k
matrix of observations on k explanatory variables, each of which varies
over all cross-sectional dimensions (as did WM_{ijt} in the reenlistment
model), β is a k-vector of coefficients corresponding to the variables X,
Z is an n x m matrix of observations on m explanatory variables, each of
which varies over at most two cross-sectional dimensions (as did WC
and C, for example, in the reenlistment model), and Ω is an m-vector of
coefficients corresponding to the variables in Z. Then it is evident that,
if the observations are "normalized" as in the reenlistment model, the
variables Z will disappear from the normalized data. So the model in
normalized form becomes Y_n = X_n β + ε_n, where, for example, the typical
element of ε_n is:

    ε_{ijt} - ε_{ij.} - ε_{i.t} - ε_{.jt} + ε_{i..} + ε_{.j.} + ε_{..t} - ε_{...} .

The procedure of normalizing data in this manner, then, is advantageous
when it is desirable to rid the model of one or more of the variables in
Z. For example, theoretical or practical considerations may dictate
that a variable in Z be included in the model, but this variable may in
practice turn out to be unobserved (as was WC in the reenlistment
model) or even unobservable (as was C in the reenlistment model). An
obvious disadvantage is that all the variables Z disappear in the
normalized data, so that none of the coefficients in Ω can be estimated
using normalized observations. The normalization procedure can also be
used to advantage to rid the model of disturbance terms of a certain
form. This is the subject of a later part of this section.


B. A NECESSARY IDEMPOTENT MATRIX 


Consider the set of all ordered triples of three indices, i, j, t:

    i = 1, ..., I;  j = 1, ..., J;  t = 1, ..., T.

There are IJT unique such ordered triples. Construct an IJT x IJT
matrix, the rows and columns of which are each indexed with one of the
ordered triples (i, j, t), as follows: If the k-th row of this matrix,
call it V, is indexed with (i_1, j_1, t_1), then the k-th column of V is
also indexed with (i_1, j_1, t_1). For the row of V indexed with
(i_1, j_1, t_1) and the column of V indexed with (i_2, j_2, t_2), let the
corresponding element of V be equal to

    -(J-1)(T-1)/IJT          if i_1 ≠ i_2, j_1 = j_2, t_1 = t_2
    -(I-1)(T-1)/IJT          if i_1 = i_2, j_1 ≠ j_2, t_1 = t_2
    -(I-1)(J-1)/IJT          if i_1 = i_2, j_1 = j_2, t_1 ≠ t_2
    (T-1)/IJT                if i_1 ≠ i_2, j_1 ≠ j_2, t_1 = t_2
    (J-1)/IJT                if i_1 ≠ i_2, j_1 = j_2, t_1 ≠ t_2
    (I-1)/IJT                if i_1 = i_2, j_1 ≠ j_2, t_1 ≠ t_2
    -1/IJT                   if i_1 ≠ i_2, j_1 ≠ j_2, t_1 ≠ t_2
    (I-1)(J-1)(T-1)/IJT      if i_1 = i_2, j_1 = j_2, t_1 = t_2

Within each row and each column of V, then, there are (I-1) elements of
the first type, (J-1) elements of the second type, (T-1) elements of the
third type, (I-1)(J-1) elements of the fourth type, (I-1)(T-1) elements
of the fifth type, (J-1)(T-1) elements of the sixth type, (I-1)(J-1)(T-1)
elements of the seventh type, and one element of the eighth type.

From the symmetrical construction of V, it is apparent that V is 
symmetric. That V is singular is also apparent, since VN = 0, where N 
is the n-vector with unit elements (that is, the sum of the elements in 
each row and each column of V is equal to zero) and n = IJT. 


And it can be shown that V is idempotent as well. Let X be an
arbitrary n x r matrix. For convenience of representation, let the m-th
row of X be indexed with the same ordered triple (i, j, t) as the m-th
row of V. Consider the k-th column of VX. If X^k is the k-th column of X,
then VX^k is the k-th column of VX, so that without loss of generality it
is necessary only to consider the case r = 1 in order to establish the
form of VX. Let X_{ijt} be a typical element of the n x 1 matrix X.
Then the (i_1, j_1, t_1) element of VX is of the form:

    \frac{1}{IJT} [ (I-1)(J-1)(T-1) X_{i_1 j_1 t_1}
        - (J-1)(T-1) \sum_{i ≠ i_1} X_{i j_1 t_1}
        - (I-1)(T-1) \sum_{j ≠ j_1} X_{i_1 j t_1}
        - (I-1)(J-1) \sum_{t ≠ t_1} X_{i_1 j_1 t}
        + (T-1) \sum_{i ≠ i_1} \sum_{j ≠ j_1} X_{i j t_1}
        + (J-1) \sum_{i ≠ i_1} \sum_{t ≠ t_1} X_{i j_1 t}
        + (I-1) \sum_{j ≠ j_1} \sum_{t ≠ t_1} X_{i_1 j t}
        - \sum_{i ≠ i_1} \sum_{j ≠ j_1} \sum_{t ≠ t_1} X_{i j t} ]

    = X_{i_1 j_1 t_1} - X_{i_1 j_1 .} - X_{i_1 . t_1} - X_{. j_1 t_1}
      + X_{i_1 ..} + X_{. j_1 .} + X_{.. t_1} - X_{...} .

That is, the matrix V is the linear transformation which reduces the
original data X to data in the normalized form.

Now consider the matrix product VVX. Let \tilde{\tilde{X}}_{i_1 j_1 t_1}
be the typical element of VVX, and let \tilde{X}_{ijt} represent the
typical element of VX:

    \tilde{X}_{ijt} = X_{ijt} - X_{ij.} - X_{i.t} - X_{.jt} + X_{i..} + X_{.j.} + X_{..t} - X_{...} .

Analogous to the above derivation,

    \tilde{\tilde{X}}_{i_1 j_1 t_1} = \tilde{X}_{i_1 j_1 t_1} - \tilde{X}_{i_1 j_1 .} - \tilde{X}_{i_1 . t_1} - \tilde{X}_{. j_1 t_1}
        + \tilde{X}_{i_1 ..} + \tilde{X}_{. j_1 .} + \tilde{X}_{.. t_1} - \tilde{X}_{...} .

But:

    \tilde{X}_{i_1 j_1 .} = \tilde{X}_{i_1 . t_1} = \tilde{X}_{. j_1 t_1} = \tilde{X}_{i_1 ..}
        = \tilde{X}_{. j_1 .} = \tilde{X}_{.. t_1} = \tilde{X}_{...} = 0 .

So that:

    \tilde{\tilde{X}}_{i_1 j_1 t_1} = \tilde{X}_{i_1 j_1 t_1} .

In particular this holds for the vector X_k which has zeros in each
element except the k-th, which is equal to one. That is, VVX_k = VX_k.
But VVX_k is the k-th column of VV, and VX_k is the k-th column of V.
This holds for each k = 1, ..., IJT, so that each column of VV is equal
to the corresponding column of V. Hence VV = V, so that V is, by
definition, idempotent.

The idempotency of V can be seen equivalently as follows. Consider
the equation VX = λX, where λ is any eigenvalue of V, and X is a
corresponding eigenvector (X ≠ 0 by assumption). Pre-multiplying both
sides of this equation by V yields:

    VVX = VλX = λVX = λ^2 X.

But VVX = VX = λX, so that λX = λ^2 X. So either λ = 0 or it is possible
to divide by λ to get X = λX. Then X'X = X'λX = λX'X, where X'X is a
strictly positive scalar. Hence if λ ≠ 0, then λ = X'X/X'X = 1. That is,
for the matrix V, all eigenvalues are equal to 1 or to 0. Now the claim
that V is idempotent can be made, since a sufficient condition for a
symmetric matrix to be idempotent is that each of its non-zero
eigenvalues be equal to unity.

Now since V is idempotent, its rank is equal to its trace. And the
trace of V is equal to the sum of its diagonal elements. That is,
tr(V) = IJT [(I-1)(J-1)(T-1)/IJT] = (I-1)(J-1)(T-1). Hence the rank of V
is (I-1)(J-1)(T-1).
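
The stated properties of V can be verified exactly for a small case.
The Python sketch below is illustrative only. It builds V as a Kronecker
product of three "centering" matrices, which is not the construction
given in the text but yields the same eight types of element; the
function names centering, build_v and matmul and the dimensions
I = 2, J = 3, T = 4 are hypothetical. Exact rational arithmetic is used
so that the checks involve no rounding.

```python
from fractions import Fraction
from itertools import product


def centering(n):
    """n x n matrix with (n-1)/n on the diagonal and -1/n elsewhere."""
    return [[Fraction(n - 1, n) if r == c else Fraction(-1, n) for c in range(n)]
            for r in range(n)]


def build_v(I, J, T):
    """V with rows and columns ordered by the triples (i, j, t)."""
    ci, cj, ct = centering(I), centering(J), centering(T)
    idx = list(product(range(I), range(J), range(T)))
    return [[ci[i1][i2] * cj[j1][j2] * ct[t1][t2] for (i2, j2, t2) in idx]
            for (i1, j1, t1) in idx]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


I, J, T = 2, 3, 4
n = I * J * T
V = build_v(I, J, T)

symmetric = all(V[r][c] == V[c][r] for r in range(n) for c in range(n))
idempotent = matmul(V, V) == V
row_sums_zero = all(sum(row) == 0 for row in V)   # VN = 0
trace = sum(V[k][k] for k in range(n))            # equals the rank of V
```

For these dimensions the trace is (I-1)(J-1)(T-1) = 6, in agreement with
the rank derived above.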


C. ORDINARY LEAST SQUARES ESTIMATION UNDER THE TRANSFORMATION V 
Consider once again the model described in Section A, Y = Xβ + ZΩ + ε,
where Y, X, β, Z, Ω and ε are as defined there. Recall that the number of
cross-sectional dimensions involved was assumed, for purely illustrative
purposes, to be three. Suppose that one cross-sectional dimension is
resolved into I categories, the second dimension into J categories, and
the third dimension into T categories. Then there are n = IJT
observations in Y, and to each observation in Y there can be assigned a
unique ordered triple (i,j,t) which represents the appropriate category
of each of the cross-sectional dimensions for that observation in Y.
Obviously this same ordered triple is assigned to the corresponding
observations of the variables in X and in Z, as well as to the
corresponding element of ε. Now suppose that the matrix V has been
constructed so that the index of the p-th row of V is equal to the index
of the p-th observation in Y. Then pre-multiplying the above equation by
V yields VY = VXβ + VZΩ + Vε, where VZ = 0 and VY ≠ 0 ≠ VX since by
assumption the dependent variable whose observations are represented by
Y and the k explanatory variables whose observations are represented by
X vary over all cross-sectional dimensions, while the variables whose
observations are represented by Z vary over at most two cross-sectional
dimensions. So the equation becomes VY = VXβ + Vε.

Note that the above property provides a concise operational
definition of the phrase "varies over all cross-sectional dimensions." A
non-stochastic variable whose vector of observations, over all possible
categories of the cross-sectional dimensions, is given by W may be said
to vary over all cross-sectional dimensions if VW ≠ 0. It will be shown
in a later section that the element of VW which is indexed by (i,j,t) may
be interpreted as the three-way interaction of the i-th category of one
cross-sectional dimension, the j-th category of the second dimension, and
the t-th category of the third dimension. Similarly, for a stochastic
variable whose vector of observations is given by W, the element of VW
indexed by (i,j,t) may be interpreted as the sample estimate of this
three-way interaction term.

Now in order to discuss the ordinary least squares estimator of β in
the equation VY = VXβ + Vε it is necessary to consider the rank of VX.
Suppose that r(X) = k (k < n), so that (X'X)^{-1} exists. If it were the
case that r(X) < k, then the coefficient vector β in the equation
Y = Xβ + ZΩ + ε would be inestimable in the original data, since a
necessary condition for the ordinary least squares estimators, in the
original data, of β and Ω to exist is that both X'X and Z'Z are
nonsingular. That is, these estimators in the original data, in
partitioned matrix form,

    [ B ; \hat{Ω} ] = [ X'X  X'Z ; Z'X  Z'Z ]^{-1} [ X'Y ; Z'Y ] ,

exist only if (X'X)^{-1} and (Z'Z)^{-1} exist. So the assumption that
r(X) = k is no more restrictive in the ordinary least squares estimation
of β using data in the form VY, VX than it was in the ordinary least
squares estimation of β using the original data Y, X. [Note that this
discussion applies only to estimation of the originally specified
k-vector β of coefficients. It may of course be possible, even if
r(X) < k, to estimate a linear combination of some of the coefficients in
β. But this is not the goal here.] Now since r(V) = (I-1)(J-1)(T-1), a
necessary condition for (VX)'(VX) = X'VX to be nonsingular is that
r(VX) = k. So a necessary condition is that k ≤ (I-1)(J-1)(T-1). That is,
the matrix X may represent observations on at most (I-1)(J-1)(T-1)
explanatory variables. Consequently, in all discussion hereafter, the
requirement that k ≤ (I-1)(J-1)(T-1) < IJT = n will be made.

Additionally, the requirement that r(VX) = k means that the columns
of VX must be linearly independent. But these are simply the vectors
which represent the three-way interaction terms for each variable in X.
This is a new restriction, not encountered when basing estimators upon
the original observations. It may turn out, in some cases, to prohibit
application of V in the model. It is certainly not prohibitive when X
represents observations on only one explanatory variable (as was the case
for WM_{ijt} in the reenlistment model). It may be worth noting that the
circumstances in which r(VX) < k can be stated more succinctly: r(VX) < k
if and only if some nontrivial linear combination of the vectors in X is
in the null space of the transformation V.

If r(VX) = k, then X'VX is nonsingular, and the ordinary least
squares estimator, under the transformation V, for β in Y = Xβ + ZΩ + ε
is

    B = [(VX)'(VX)]^{-1} (VX)'(VY) = (X'VX)^{-1} X'VY.
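
For a single explanatory variable the estimator reduces to a ratio of
sums over the normalized data. The Python sketch below illustrates this
on simulated values; it is not the computation used for the reenlistment
data. The function name normalize, the coefficient 1.5, the nuisance
terms z_ij and z_t and the array sizes are all hypothetical, and the
disturbance is omitted so that the coefficient is recovered exactly up
to rounding.

```python
import random
from itertools import product


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def normalize(x):
    """Apply the transformation V to a complete 3-D nested list x[i][j][t]."""
    I, J, T = len(x), len(x[0]), len(x[0][0])
    m_ij = [[mean(x[i][j]) for j in range(J)] for i in range(I)]
    m_it = [[mean(x[i][j][t] for j in range(J)) for t in range(T)] for i in range(I)]
    m_jt = [[mean(x[i][j][t] for i in range(I)) for t in range(T)] for j in range(J)]
    m_i = [mean(m_ij[i]) for i in range(I)]
    m_j = [mean(m_ij[i][j] for i in range(I)) for j in range(J)]
    m_t = [mean(m_it[i][t] for i in range(I)) for t in range(T)]
    m = mean(m_i)
    return [[[x[i][j][t] - m_ij[i][j] - m_it[i][t] - m_jt[j][t]
              + m_i[i] + m_j[j] + m_t[t] - m
              for t in range(T)] for j in range(J)] for i in range(I)]


random.seed(1)
I, J, T = 3, 4, 5
beta = 1.5  # hypothetical true coefficient
cells = list(product(range(I), range(J), range(T)))
x = [[[random.gauss(0.0, 1.0) for _ in range(T)] for _ in range(J)] for _ in range(I)]
# Unobserved terms varying over at most two dimensions (the role of Z).
z_ij = [[random.gauss(0.0, 5.0) for _ in range(J)] for _ in range(I)]
z_t = [random.gauss(0.0, 5.0) for _ in range(T)]
y = [[[beta * x[i][j][t] + z_ij[i][j] + z_t[t] for t in range(T)]
      for j in range(J)] for i in range(I)]

xn, yn = normalize(x), normalize(y)
sxy = sum(xn[i][j][t] * yn[i][j][t] for i, j, t in cells)  # X'VY
sxx = sum(xn[i][j][t] ** 2 for i, j, t in cells)           # X'VX
b_hat = sxy / sxx
```

Although z_ij and z_t are never observed by the estimator, b_hat equals
beta up to rounding, because both nuisance terms are annihilated by the
normalization.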

A definition of terms should now be made. B, in the equation above,
has been called an estimator for β under the transformation V. But it
is clear that if B is linear in VY, then it is also linear in Y. That
is, for any linear transformation A, A(VY) = CY for some linear
transformation C. The reason for this apparently unnecessary terminology
is that this estimator B is the best linear unbiased estimator for β (it
will be shown later) among all those unbiased estimators for β that are
linear in VY. [The definition of "best" used throughout this paper is
that employed in the Gauss-Markov theorem. An estimator B for β in the
equation Y = Xβ + ZΩ + ε is best linear unbiased if it is linear in Y,
if it is unbiased and if any other estimator of β which is also linear
in Y and unbiased has a covariance matrix which exceeds that of B by a
positive semidefinite matrix.] That B can be the best unbiased estimator
linear in VY and yet not be the best unbiased estimator linear in Y is
clear, since the transformation V is not invertible. That is, no linear
transformation on VW can reproduce W. If this were possible, then there
would exist some matrix A such that AVW = W for all W. But since V is
singular, there must exist a vector W_0 (not identically zero) such that
VW_0 = 0. Specifically, W_0 = N can be the n-vector with unit elements.
So AVW_0 = A0 = 0 ≠ W_0. [Equivalently, V is not an isomorphism. It has
null space S = {W : VW = 0}. Consequently, V maps all vectors of the
form Z + cN, where c is a scalar and N the n-vector of unit elements,
into the vector VZ.] In addition to being the best linear unbiased
estimator for β under the transformation V, B is in many cases the best
linear unbiased estimator for β as well. This is the subject of the next
part of this section.


D. POOLED TIME SERIES AND CROSS-SECTION DATA: EFFECT OF THE COMPOSITION
OF THE DISTURBANCE TERM ON THE MODEL

The ordinary least squares estimator for $\beta$, under V, shows a degree
of insensitivity in its quality of "best linear unbiasedness under V" to
the composition of the disturbance term of the model. The type of
composition of the disturbance term for which the property of best
linear unbiasedness, under V, of B is invariant is considered here.

It may happen that in a regression model involving time series and
cross-section data the disturbance term for an observation is composed
of effects due to the cross-section, an effect due to the time series,
and a series of remainder terms (that is, components of the disturbance
term which are due to the joint effects of cross-section and time
series). [This is as postulated by, for example, Kuh [11] and Chetty [12].]
For example, the disturbance term $\varepsilon_{ijt}$ for economic entity i,
subject to factor j at time t, may be given by:


1. $\varepsilon_{ijt} = \eta_{ijt} + \alpha_i + \gamma_j + \delta_t + \lambda_{ij} + \omega_{it} + \tau_{jt}$, where

2. $E(\eta_{ijt}) = 0$, i = 1, ..., I, j = 1, ..., J, t = 1, ..., T

3. $\mathrm{Var}(\eta_{ijt}) = \sigma^2$ for all i, j, t

4. The $\eta_{ijt}$ are independent, Normally distributed random variables

5. No statements can be made concerning the distributions of the
random variables $\alpha_i$, $\gamma_j$, $\delta_t$, $\lambda_{ij}$, $\omega_{it}$, $\tau_{jt}$

6. No statements can be made concerning the independence, or
correlations, of the random variables $\eta_{ijt}$, $\alpha_i$, $\gamma_j$, $\delta_t$,
$\lambda_{ij}$, $\omega_{it}$, $\tau_{jt}$ (other than as in 4. above)

7. Each random variable is invariant over any dimension not included
as a subscript in its notational expression.

The disturbance structure hypothesized here is central to later work. 
For ease of reference, call the error structure formally assumed by 
statements 1. through 7. above "disturbance structure (A)." 

Under the specifications of disturbance structure (A), no conclusion
can be drawn about the form of $E(\varepsilon)$ or $\mathrm{Var}(\varepsilon)$. Consequently
no claims can be made regarding the unbiasedness of the ordinary least
squares estimator for $\beta$ in the original data. And the generalized least
squares estimator is unknown, since $\mathrm{Var}(\varepsilon)$ is unknown. But for
$\varepsilon = [\varepsilon_{ijt}]$ and $\eta = [\eta_{ijt}]$ as specified above,
$V\varepsilon = V\eta$, since $V\alpha = V\gamma = V\delta = V\lambda = V\omega = V\tau = 0$.
Hence under disturbance structure (A) the ordinary least squares
estimator, under V, for $\beta$ is unbiased:

\[ B = (X'VX)^{-1}X'VY = \beta + (X'VX)^{-1}X'V\varepsilon = \beta + (X'VX)^{-1}X'V\eta, \]

\[ E(B) = E[(X'VX)^{-1}X'VY] = \beta + (X'VX)^{-1}X'V\,E(\eta) = \beta + 0 = \beta. \]


And the variance of B is given by:

\[ \mathrm{Var}(B) = E[(B-\beta)(B-\beta)'] \]
\[ = E[(X'VX)^{-1}X'V\varepsilon\varepsilon'VX(X'VX)^{-1}] \]
\[ = E[(X'VX)^{-1}X'V\eta\eta'VX(X'VX)^{-1}] \]
\[ = (X'VX)^{-1}X'V\,E(\eta\eta')\,VX(X'VX)^{-1} \]
\[ = \sigma^2 (X'VX)^{-1}X'VX(X'VX)^{-1} = \sigma^2 (X'VX)^{-1}, \]

since $E(\eta\eta') = \sigma^2 I$, and since V is idempotent.
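
The purging property behind these results can be illustrated numerically. The sketch below assumes that applying V amounts to the three-way normalization by means described in Section G; the function name `normalize`, the toy dimensions, and the value chosen for the coefficient are hypothetical. It checks that the lower-dimensional components vanish under normalization and that, with $\eta = 0$, the single-regressor estimator reproduces the coefficient exactly.

```python
from fractions import Fraction
from itertools import product
import random

def normalize(y, I, J, T):
    """Apply y_ijt - y_ij. - y_i.t - y_.jt + y_i.. + y_.j. + y_..t - y_...
    to a dict keyed by (i, j, t)."""
    cells = list(product(range(I), range(J), range(T)))

    def means(keep):
        # Mean of y over every dimension not in `keep`, keyed by the kept subscripts.
        total, count = {}, {}
        for cell in cells:
            key = tuple(cell[d] for d in keep)
            total[key] = total.get(key, 0) + y[cell]
            count[key] = count.get(key, 0) + 1
        return {key: Fraction(total[key], count[key]) for key in total}

    m_ij, m_it, m_jt = means((0, 1)), means((0, 2)), means((1, 2))
    m_i, m_j, m_t, m_all = means((0,)), means((1,)), means((2,)), means(())
    return {
        (i, j, t): y[(i, j, t)] - m_ij[(i, j)] - m_it[(i, t)] - m_jt[(j, t)]
        + m_i[(i,)] + m_j[(j,)] + m_t[(t,)] - m_all[()]
        for (i, j, t) in cells
    }

random.seed(0)
I, J, T = 3, 2, 4                      # toy dimensions (hypothetical)
cells = list(product(range(I), range(J), range(T)))
beta = Fraction(3, 2)                  # hypothetical true coefficient

# Arbitrary components that each vary over fewer than three dimensions.
alpha = [random.randint(-9, 9) for _ in range(I)]
gamma = [random.randint(-9, 9) for _ in range(J)]
delta = [random.randint(-9, 9) for _ in range(T)]
lam = {(i, j): random.randint(-9, 9) for i in range(I) for j in range(J)}
omega = {(i, t): random.randint(-9, 9) for i in range(I) for t in range(T)}
tau = {(j, t): random.randint(-9, 9) for j in range(J) for t in range(T)}
low_dim = {(i, j, t): alpha[i] + gamma[j] + delta[t]
           + lam[(i, j)] + omega[(i, t)] + tau[(j, t)] for (i, j, t) in cells}

purged = normalize(low_dim, I, J, T)   # every entry should be exactly zero

# A regressor with a three-way interaction, and y built with eta = 0.
x = {(i, j, t): (i + 1) * (j + 1) * (t + 1) for (i, j, t) in cells}
y = {c: beta * x[c] + low_dim[c] for c in cells}
xn, yn = normalize(x, I, J, T), normalize(y, I, J, T)
B = sum(xn[c] * yn[c] for c in cells) / sum(xn[c] ** 2 for c in cells)
print(all(v == 0 for v in purged.values()), B == beta)
```

With a nonzero $\eta$ the estimate would differ from the coefficient by a term linear in the normalized $\eta$, which is the term whose expectation vanishes in the derivation above.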

It is now possible to show that, under disturbance structure (A),
B is the best linear unbiased estimator, under V, for $\beta$. But it is first
worthwhile to show that any linear transformation which has null space
identical to that of V (that is, any linear transformation which maps
precisely the same vectors onto the null vector) is itself a linear
transformation, under a nonsingular matrix, of V. That is, that the
matrix V which removes the stochastic variables $\alpha_i$, $\gamma_j$, $\delta_t$,
$\lambda_{ij}$, $\omega_{it}$ and $\tau_{jt}$ from the disturbance term, and under
which the image of a vector $[\eta_{ijt}]$ which varies over all dimensions
is non-null, is unique up to a nonsingular linear transformation C.
Suppose there exists another linear transformation, say A, such that
$A\varepsilon = A\eta$ ($A\alpha = A\gamma = A\delta = A\lambda = A\omega = A\tau = 0$),
for all n-vectors $\varepsilon$. Then since A and V are to have the same null
space, AX = 0 if and only if VX = 0. In particular, this must hold for the
vector VX: AVX = 0 if and only if VVX = VX = 0. An equivalent statement is
that the system A(VX) = 0 has only the trivial solution VX = 0. Hence
either A is nonsingular or A = CV for nonsingular C (in the latter case
AVX = CVVX = CVX and AX = CVX). But if A is nonsingular, then AX = 0
implies that X = 0. So, for nonsingular A, A and V could not have the
same null space. Hence A = CV, for nonsingular C.

Now since CV, for nonsingular C, is the only linear transformation
which removes the stochastic variables $\alpha_i$, $\gamma_j$, $\delta_t$,
$\lambda_{ij}$, $\omega_{it}$ and $\tau_{jt}$ from the model, any other unbiased
estimator of $\beta$ must be linear in CVY, hence in VY. Consider any other
such estimator, say AVY, where A is a k x n matrix independent of Y. Let

\[ D = A - (X'VX)^{-1}X'. \]

Then

\[ AVY = [D + (X'VX)^{-1}X']VY = [D + (X'VX)^{-1}X'][VX\beta + V\varepsilon] \]
\[ = [DVX + I]\beta + [D + (X'VX)^{-1}X']V\varepsilon. \]

But

\[ E(AVY) = (DVX + I)\beta + [D + (X'VX)^{-1}X']E(V\varepsilon) \]
\[ = (DVX + I)\beta + [D + (X'VX)^{-1}X']V\,E(\eta) = (DVX + I)\beta. \]

So in order for AVY to be unbiased, it is necessary that DVX = 0. So the
estimator becomes $\beta + [D + (X'VX)^{-1}X']V\varepsilon$. The corresponding
sampling error is $[D + (X'VX)^{-1}X']V\varepsilon$, and the covariance matrix is:

\[ E\{[D + (X'VX)^{-1}X']V\varepsilon\varepsilon'V[D' + X(X'VX)^{-1}]\} \]
\[ = E\{[D + (X'VX)^{-1}X']V\eta\eta'V[D' + X(X'VX)^{-1}]\} \]
\[ = \sigma^2 [DV + (X'VX)^{-1}X'V][VD' + VX(X'VX)^{-1}] \]
\[ = \sigma^2 [DVD' + DVX(X'VX)^{-1} + (X'VX)^{-1}X'VD' + (X'VX)^{-1}X'VX(X'VX)^{-1}] \]
\[ = \sigma^2 [DVD' + (X'VX)^{-1}], \]

where the two middle terms vanish because DVX = 0. So the covariance
matrix of the estimator AVY exceeds the covariance matrix of
$B = (X'VX)^{-1}X'VY$ by $\sigma^2 DVD'$, a positive semidefinite matrix. Hence
B is the best linear unbiased estimator under V in the sense that its
covariance matrix is exceeded, by a positive semidefinite matrix, by the
covariance matrix of any other linear unbiased estimator of $\beta$ under V.

And, since B is the best linear unbiased estimator for $\beta$ under V,
and since only those estimators linear in VY can claim to be unbiased,
the estimator B is the best linear unbiased estimator for $\beta$ under
disturbance structure (A).

The discussion of the hypothesized error structure has been couched
in terms of pooled cross-section and time series data. But in any
regression model involving cross-sectional data (no matter what the nature
of the cross-sectional dimensions) it is clear that, if no more specific
statement about the error structure can be made than that disturbance
structure (A) applies, then $B = (X'VX)^{-1}X'VY$ is the best linear unbiased
estimator for $\beta$.
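
The positive semidefiniteness of DVD' that the argument of this section relies on follows in one line from the symmetry and idempotency of V. A short derivation, in which w is an arbitrary k-vector introduced here purely for illustration:

```latex
\[
  w' D V D' w \;=\; w' D V V' D' w \;=\; (V D' w)'(V D' w)
  \;=\; \lVert V D' w \rVert^{2} \;\ge\; 0 ,
  \qquad \text{using } V = V' \text{ and } VV = V .
\]
```

Equality holds exactly when D'w lies in the null space of V, so an alternative estimator can match B in some directions without improving on it in any.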


E. AN UNBIASED ESTIMATOR FOR $\sigma^2$

Assume disturbance structure (A) from the preceding section applies.
The purpose of this section is to show that

\[ s^2 = e'e / [(I-1)(J-1)(T-1) - k] \]

is an unbiased estimator for $\sigma^2$ in

\[ \mathrm{Var}(B) = \sigma^2 (X'VX)^{-1}. \]

Consider the estimator $B = (X'VX)^{-1}X'VY$ of $\beta$ in the model:

\[ Y = X\beta + Z\Omega + \varepsilon, \qquad VY = VX\beta + V\varepsilon. \]

The residual vector is

\[ e = VY - VXB = VY - VX(X'VX)^{-1}X'VY = [V - VX(X'VX)^{-1}X'V]Y. \]

Let $M = V - VX(X'VX)^{-1}X'V$. Then e = MY and M is an idempotent matrix
with trace (I-1)(J-1)(T-1) - k. To see the idempotency of M:

\[ MM = [V - VX(X'VX)^{-1}X'V][V - VX(X'VX)^{-1}X'V] \]
\[ = V - VX(X'VX)^{-1}X'V - VX(X'VX)^{-1}X'V + VX(X'VX)^{-1}X'VX(X'VX)^{-1}X'V \]
\[ = V - VX(X'VX)^{-1}X'V - VX(X'VX)^{-1}X'V + VX(X'VX)^{-1}X'V \]
\[ = V - VX(X'VX)^{-1}X'V = M. \]


To see tr(M) = (I-1)(J-1)(T-1) - k:
Since the trace of the difference of two matrices is equal to the
difference of the traces,

\[ \mathrm{tr}(M) = \mathrm{tr}(V) - \mathrm{tr}(VX(X'VX)^{-1}X'V) \]
\[ = (I-1)(J-1)(T-1) - \mathrm{tr}(VX(X'VX)^{-1}X'V). \]

And since for two matrices A, B, of compatible order, tr(AB) = tr(BA),

\[ \mathrm{tr}(M) = (I-1)(J-1)(T-1) - \mathrm{tr}((X'VX)^{-1}X'VX) \]
\[ = (I-1)(J-1)(T-1) - \mathrm{tr}(I_k) = (I-1)(J-1)(T-1) - k, \]

where $I_k$ is the identity matrix of order k.
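
The trace of M is the divisor that appears in the estimator of the disturbance variance, and it can be much smaller than the divisor IJT - k used with the original data. A minimal sketch follows; the function names and the example dimensions are hypothetical and are not taken from the reenlistment data.

```python
def residual_degrees_of_freedom(I, J, T, k):
    """Trace of M: (I-1)(J-1)(T-1) - k, the divisor in the estimator s^2."""
    df = (I - 1) * (J - 1) * (T - 1) - k
    if df <= 0:
        raise ValueError("too few categories for k regressors after normalization")
    return df

def sigma2_estimate(residuals, I, J, T, k):
    """s^2 = e'e / [(I-1)(J-1)(T-1) - k] for residuals e of the normalized regression."""
    return sum(e * e for e in residuals) / residual_degrees_of_freedom(I, J, T, k)

# Hypothetical layout: 4 categories, 3 categories, 7 periods, one regressor.
df = residual_degrees_of_freedom(4, 3, 7, 1)
naive_df = 4 * 3 * 7 - 1
print(df, naive_df)   # prints: 35 83
```

The gap between the two counts reflects the rank lost to the normalization, not a loss of observations.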


The residual vector may also be written
$e = MY = MVY = MV(X\beta + Z\Omega + \varepsilon) = MV\varepsilon$, since VZ = 0 and
$MVX = VX - VX(X'VX)^{-1}X'VX = VX - VX = 0$.

So the error sum of squares is
$e'e = \varepsilon'VM'MV\varepsilon = \varepsilon'VMV\varepsilon = \eta'VMV\eta = \eta'M\eta$,
since $V\varepsilon = V\eta$ and VMV = M. And, since $\eta'M\eta$ is scalar, it is
equal to its own trace: $e'e = \mathrm{tr}(\eta'M\eta)$. And since tr(AB) = tr(BA),
$e'e = \mathrm{tr}(\eta'M\eta) = \mathrm{tr}(M\eta\eta')$. And since the trace of a square
matrix is a linear operation on the matrix, the expected value of the
trace is equal to the trace of the expected value:

\[ E(e'e) = E[\mathrm{tr}(M\eta\eta')] = \mathrm{tr}[E(M\eta\eta')] = \mathrm{tr}[M\,E(\eta\eta')] = \mathrm{tr}[\sigma^2 M I] = \mathrm{tr}[\sigma^2 M] = \sigma^2\,\mathrm{tr}(M), \]

since for a scalar k and matrix A, tr(kA) = k tr(A). Hence
$E(s^2) = \sigma^2\,\mathrm{tr}(M)/[(I-1)(J-1)(T-1) - k] = \sigma^2$.

F. THE JOINT DISTRIBUTION OF B AND $s^2$

A theorem with application in statistical analysis may be expressed
as follows: If A is an idempotent matrix and u is an n-variate Normal
random variable from a $N(0, \sigma^2 I)$ distribution, then the quadratic form
$u'Au/\sigma^2$ is distributed $\chi^2$ with q degrees of freedom, where
q = tr(A) = rank of A. [For a proof of this theorem, as well as of the
converse implication, see Hogg, R., and Craig, A., Introduction to
Mathematical Statistics, pp. 348-351, MacMillan, 1965.] This theorem can
be applied to the results of the preceding section, which showed that
$e'e = \eta'M\eta$, where M is idempotent and the elements of $\eta$ are
independent identically distributed Normal random variables, each with
mean zero and variance $\sigma^2$. By the theorem, $e'e/\sigma^2$ is distributed
$\chi^2$ with (I-1)(J-1)(T-1) - k degrees of freedom.

Now consider the estimator B for $\beta$. It has already been shown that
$E(B) = \beta$ and

\[ \mathrm{Var}(B) = \sigma^2 (X'VX)^{-1}. \]

And

\[ B = (X'VX)^{-1}X'VY = (X'VX)^{-1}X'V(X\beta + \varepsilon) = \beta + (X'VX)^{-1}X'V\varepsilon = \beta + (X'VX)^{-1}X'V\eta. \]

So, since B is linear in the components of $\eta$, B has a multivariate
Normal distribution also:

\[ B \sim N(\beta,\ \sigma^2 (X'VX)^{-1}). \]

It can now be shown that the Chi-square and Normal distributions described
above are independent. Note that $e'e/\sigma^2 = \eta'M\eta/\sigma^2$ is an idempotent
quadratic form in $\eta$, and that $B = \beta + (X'VX)^{-1}X'V\eta$ is a vector whose
elements are linear in $\eta$, where the components of $\eta$ are independent
identically distributed Normal random variables. A sufficient condition
for $e'e/\sigma^2$ and B to be statistically independent is that the product of
$(X'VX)^{-1}X'V$ and M be equal to the null matrix. That this is so is easily
verified:

\[ [(X'VX)^{-1}X'V]M = [(X'VX)^{-1}X'V][V - VX(X'VX)^{-1}X'V] \]
\[ = (X'VX)^{-1}X'V - (X'VX)^{-1}X'VX(X'VX)^{-1}X'V \]
\[ = (X'VX)^{-1}X'V - (X'VX)^{-1}X'V = 0. \]

Hence $e'e/\sigma^2$ and B are independent.
[For a proof of the sufficient condition for independence used above, see
Theil, H., Principles of Econometrics, pp. 83-84, Wiley, 1971.]

Now since

\[ s^2 = \frac{e'e}{(I-1)(J-1)(T-1) - k} = \frac{\sigma^2}{(I-1)(J-1)(T-1) - k}\cdot\frac{e'e}{\sigma^2} \]

is linear in $e'e/\sigma^2$, $s^2$ and B are independent as well.
As a consequence, it is now possible to get a joint distribution of
$s^2$ and a linear combination of the components of B. Now
$B - \beta \sim N(0,\ \sigma^2 (X'VX)^{-1})$. Let W be a k-vector of constants. Then

\[ W'(B-\beta) \sim N(0,\ \sigma^2 W'(X'VX)^{-1}W) \]

and

\[ \frac{W'(B-\beta)}{[\sigma^2 W'(X'VX)^{-1}W]^{1/2}} \sim N(0, 1). \]

So that, since B and $s^2$ are independent,

\[ \frac{W'(B-\beta)\,/\,[\sigma^2 W'(X'VX)^{-1}W]^{1/2}}{(s^2/\sigma^2)^{1/2}} = \frac{W'(B-\beta)}{[s^2\,W'(X'VX)^{-1}W]^{1/2}} \]

has a t-distribution with (I-1)(J-1)(T-1) - k degrees of freedom.
So a confidence interval for $W'\beta$, a linear combination of the
elements of $\beta$, is given by

\[ W'B \pm t_{1-\alpha/2}\,[s^2\,W'(X'VX)^{-1}W]^{1/2}, \]

where $t_{1-\alpha/2}$ is the $100(1-\alpha/2)$ percentile of a t-distribution with
(I-1)(J-1)(T-1) - k degrees of freedom.

In particular this holds for a vector $W_p$ which has zeros in each
component, except for the p-th element, which is equal to one. Application
of this vector $W_p$ will give a confidence interval for the p-th
component of $\beta$, p = 1, ..., k.
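
The interval is simple to compute once the critical value has been looked up. The following sketch assumes the caller supplies the t percentile for (I-1)(J-1)(T-1) - k degrees of freedom; the function name `confidence_interval` and all numbers are hypothetical and serve only to show the arithmetic.

```python
import math

def confidence_interval(w_b, s2, w_xvx_inv_w, t_crit):
    """Interval W'B +/- t * sqrt(s^2 * W'(X'VX)^{-1}W).

    t_crit is the 100(1 - alpha/2) percentile of the t-distribution with
    (I-1)(J-1)(T-1) - k degrees of freedom, supplied by the caller.
    """
    half_width = t_crit * math.sqrt(s2 * w_xvx_inv_w)
    return w_b - half_width, w_b + half_width

# Hypothetical single-regressor case (k = 1): W = [1], so W'(X'VX)^{-1}W is
# the reciprocal of the sum of squared normalized x values.
estimate = 2.0
s2 = 0.25
sum_sq_normalized_x = 4.0
t_crit = 2.0   # placeholder critical value, not read from a t-table
low, high = confidence_interval(estimate, s2, 1.0 / sum_sq_normalized_x, t_crit)
print(low, high)   # prints: 1.5 2.5
```

For k > 1 the scalar `w_xvx_inv_w` would be the p-th diagonal element of the inverse of X'VX when W is the p-th unit vector.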


G. AN ALTERNATE DERIVATION OF V 

The calculations which yield the elements of the matrix V, introduced
in Section B, may not be apparent. The purpose of the present section
is to delineate the sequence of steps that lead to the elements of V.

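As a vehicle, consider a disturbance term of the form, once again,

\[ (1) \quad \varepsilon_{ijt} = \eta_{ijt} + \alpha_i + \gamma_j + \delta_t + \lambda_{ij} + \omega_{it} + \tau_{jt}, \]

where nothing is known or can be reasonably assumed about the components
of the $\varepsilon_{ijt}$ except that the $\eta_{ijt}$'s are independent Normal
random variables, each with mean zero and variance $\sigma^2$.

Now let (2) through (8) be the means of (1) over one, two, or all three
subscripts:

\[ (2)\ \varepsilon_{ij\cdot} = \frac{1}{T}\sum_t \varepsilon_{ijt}, \quad (3)\ \varepsilon_{i\cdot t} = \frac{1}{J}\sum_j \varepsilon_{ijt}, \quad (4)\ \varepsilon_{\cdot jt} = \frac{1}{I}\sum_i \varepsilon_{ijt}, \]
\[ (5)\ \varepsilon_{i\cdot\cdot} = \frac{1}{JT}\sum_j\sum_t \varepsilon_{ijt}, \quad (6)\ \varepsilon_{\cdot j\cdot} = \frac{1}{IT}\sum_i\sum_t \varepsilon_{ijt}, \quad (7)\ \varepsilon_{\cdot\cdot t} = \frac{1}{IJ}\sum_i\sum_j \varepsilon_{ijt}, \]
\[ (8)\ \varepsilon_{\cdot\cdot\cdot} = \frac{1}{IJT}\sum_i\sum_j\sum_t \varepsilon_{ijt}. \]

Adding and subtracting (1)-(2)-(3)-(4)+(5)+(6)+(7)-(8), the disturbance
term for the ijt-th observation in normalized data becomes:

\[ \mu_{ijt} = \varepsilon_{ijt} - \varepsilon_{ij\cdot} - \varepsilon_{i\cdot t} - \varepsilon_{\cdot jt} + \varepsilon_{i\cdot\cdot} + \varepsilon_{\cdot j\cdot} + \varepsilon_{\cdot\cdot t} - \varepsilon_{\cdot\cdot\cdot} \]
\[ = \eta_{ijt} - \eta_{ij\cdot} - \eta_{i\cdot t} - \eta_{\cdot jt} + \eta_{i\cdot\cdot} + \eta_{\cdot j\cdot} + \eta_{\cdot\cdot t} - \eta_{\cdot\cdot\cdot} \]
\[ = \eta_{ijt} - \frac{1}{T}\sum_t \eta_{ijt} - \frac{1}{J}\sum_j \eta_{ijt} - \frac{1}{I}\sum_i \eta_{ijt} + \frac{1}{JT}\sum_j\sum_t \eta_{ijt} + \frac{1}{IT}\sum_i\sum_t \eta_{ijt} + \frac{1}{IJ}\sum_i\sum_j \eta_{ijt} - \frac{1}{IJT}\sum_i\sum_j\sum_t \eta_{ijt}. \]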

The equations (2) through (8) above were written out in the inconvenient
summative form to make obvious the fact that the variables $\alpha_i$,
$\gamma_j$, $\delta_t$, $\lambda_{ij}$, $\omega_{it}$ and $\tau_{jt}$ disappear completely
from the disturbance term of the normalized model. This is so since the
equations (1) through (8) are written in terms of the random variables
themselves, not in terms of realizations of these random variables. These
random variables also disappear, of course, in the event that one or more
of them is degenerate, as might happen if an unobservable explanatory
variable were implicitly included in the disturbance term $\varepsilon_{ijt}$.

The expression for $\mu_{ijt}$ consists of adding and subtracting various
multiples of given random variables. But in this expression any random
variable $\eta_{ijt}$ may be included under more than one summation sign.
Concentrate on one normalized disturbance term, say $\mu_{i_1 j_1 t_1}$, and
rearrange terms in the series of summations so that each random variable
$\eta_{ijt}$ appears once and only once in the expression for $\mu_{i_1 j_1 t_1}$:

\[ \mu_{i_1 j_1 t_1} = \frac{(I-1)(J-1)(T-1)}{IJT}\,\eta_{i_1 j_1 t_1} - \frac{(I-1)(J-1)}{IJT}\sum_{t \neq t_1}\eta_{i_1 j_1 t} - \frac{(I-1)(T-1)}{IJT}\sum_{j \neq j_1}\eta_{i_1 j t_1} - \frac{(J-1)(T-1)}{IJT}\sum_{i \neq i_1}\eta_{i j_1 t_1} \]
\[ + \frac{(I-1)}{IJT}\sum_{j \neq j_1}\sum_{t \neq t_1}\eta_{i_1 j t} + \frac{(J-1)}{IJT}\sum_{i \neq i_1}\sum_{t \neq t_1}\eta_{i j_1 t} + \frac{(T-1)}{IJT}\sum_{i \neq i_1}\sum_{j \neq j_1}\eta_{i j t_1} - \frac{1}{IJT}\sum_{i \neq i_1}\sum_{j \neq j_1}\sum_{t \neq t_1}\eta_{ijt}. \]

So that $\mu_{i_1 j_1 t_1}$ is a series of summations of independent,
identically distributed Normal random variables.

Since each of these random variables $\eta_{ijt}$ has mean zero and variance
$\sigma^2$, it is clear that:

\[ E(\mu_{i_1 j_1 t_1}) = 0 \]

and

\[ \mathrm{Var}(\mu_{i_1 j_1 t_1}) = \frac{\sigma^2}{(IJT)^2}\Big[(I-1)^2(J-1)^2(T-1)^2 + (I-1)^2(J-1)^2(T-1) + (I-1)^2(J-1)(T-1)^2 + (I-1)(J-1)^2(T-1)^2 \]
\[ + (I-1)^2(J-1)(T-1) + (I-1)(J-1)^2(T-1) + (I-1)(J-1)(T-1)^2 + (I-1)(J-1)(T-1)\Big] = \frac{(I-1)(J-1)(T-1)}{IJT}\,\sigma^2. \]

Note that this applies for all $\mu_{ijt}$. And since $\mu_{ijt}$ is a linear
combination of independent, identically distributed Normal random
variables, $\mu_{ijt}$ is also Normally distributed.
Note that the diagonal elements of the covariance matrix $E(\mu\mu')$ are
each $(I-1)(J-1)(T-1)\sigma^2/IJT$. But also note that, since each of the
$\mu_{ijt}$'s is a linear combination of the same IJT random variables
$\eta_{ijt}$, i = 1, ..., I, j = 1, ..., J, t = 1, ..., T, the $\mu_{ijt}$'s are
not independent.

The remainder of the covariance matrix may be found by straightforward
but tedious calculations. Since $E(\mu_{ijt}) = 0$, these calculations (using
the summative expression in the $\eta_{ijt}$'s for each $\mu_{ijt}$) yield

\[ E(\mu_{i_1 j_1 t_1}\mu_{i_2 j_2 t_2}) = \begin{cases}
-\dfrac{(J-1)(T-1)\sigma^2}{IJT} & \text{if } i_1 \neq i_2,\ j_1 = j_2,\ t_1 = t_2 \\[1ex]
-\dfrac{(I-1)(T-1)\sigma^2}{IJT} & \text{if } i_1 = i_2,\ j_1 \neq j_2,\ t_1 = t_2 \\[1ex]
-\dfrac{(I-1)(J-1)\sigma^2}{IJT} & \text{if } i_1 = i_2,\ j_1 = j_2,\ t_1 \neq t_2 \\[1ex]
\dfrac{(T-1)\sigma^2}{IJT} & \text{if } i_1 \neq i_2,\ j_1 \neq j_2,\ t_1 = t_2 \\[1ex]
\dfrac{(J-1)\sigma^2}{IJT} & \text{if } i_1 \neq i_2,\ j_1 = j_2,\ t_1 \neq t_2 \\[1ex]
\dfrac{(I-1)\sigma^2}{IJT} & \text{if } i_1 = i_2,\ j_1 \neq j_2,\ t_1 \neq t_2 \\[1ex]
-\dfrac{\sigma^2}{IJT} & \text{if } i_1 \neq i_2,\ j_1 \neq j_2,\ t_1 \neq t_2
\end{cases} \]

So that, for the matrix previously defined, $\sigma^2 V = E(\mu\mu')$.
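
The eight cases can be encoded directly and checked for internal consistency. In the sketch below, `v_element` is a hypothetical helper that returns an element of V (that is, the covariance divided by the disturbance variance) from the pattern of agreeing subscripts; the dimensions are arbitrary. Two properties are verified on one row: the row sums to zero (V annihilates the unit vector), and the sum of squared entries equals the diagonal entry (as it must for a symmetric idempotent matrix).

```python
from fractions import Fraction
from itertools import product

def v_element(a, b, dims):
    """Element of V for index triples a and b, with dims = (I, J, T).

    The case is selected by which of the subscripts i, j, t agree.
    """
    I, J, T = dims
    same = tuple(x == y for x, y in zip(a, b))
    numerators = {
        (True, True, True): (I - 1) * (J - 1) * (T - 1),
        (False, True, True): -(J - 1) * (T - 1),
        (True, False, True): -(I - 1) * (T - 1),
        (True, True, False): -(I - 1) * (J - 1),
        (False, False, True): (T - 1),
        (False, True, False): (J - 1),
        (True, False, False): (I - 1),
        (False, False, False): -1,
    }
    return Fraction(numerators[same], I * J * T)

dims = (3, 4, 2)   # hypothetical I, J, T
cells = list(product(*(range(d) for d in dims)))
first = cells[0]
row = [v_element(first, other, dims) for other in cells]

row_sum = sum(row)                          # image of the unit vector
sum_of_squares = sum(v * v for v in row)    # (VV)_aa, which should equal V_aa
diagonal = v_element(first, first, dims)
print(row_sum, sum_of_squares, diagonal)
```

The same two checks hold for every row, since each row is a permutation of the same eight values with the same multiplicities.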


H. THE CASE WHEN FEWER THAN IJT OBSERVATIONS ARE USED

Suppose the components of the disturbance term are independent
identically distributed Normal random variables with mean zero. Then
for ordinary least squares estimation in the original data the quantity
$(IJT - k)\,s_0^2/\sigma^2$ has $\chi^2$ distribution with IJT - k degrees of freedom,
where $s_0^2 = e'e/(IJT - k)$ is the estimator in the original data of $\sigma^2$.
When normalized data are used the quantity:

\[ \frac{[(I-1)(J-1)(T-1) - k]\,s^2}{\sigma^2} = \left[\frac{(I-1)(J-1)(T-1)}{IJT}\,IJT - k\right]\frac{s^2}{\sigma^2} \]

has $\chi^2$ distribution with

\[ (I-1)(J-1)(T-1) - k = \frac{(I-1)(J-1)(T-1)}{IJT}\,IJT - k \]

degrees of freedom, for $s^2$ the estimator of $\sigma^2$ previously derived. In
addition, the latter distribution still applies when disturbance
structure (A) is assumed. An analogous relationship holds when n < IJT
observations are used in the least squares estimation (such a case might
arise when some observations must be discarded for one reason or another).
In this case, for ordinary least squares estimation in the original data
the quantity $(n - k)\,s_0^2/\sigma^2$ has $\chi^2$ distribution with n - k degrees of
freedom. It is desired to show the analogous distribution (in $s^2$) when
normalized data are used. But when not all observations are used,
the method of "normalizing" the remaining observations is not obvious.
The most straightforward approach is to take the appropriate means, in
the normalization process, over those observations that are available.
Then, for example, the normalization of the (i,j,t)-th observation on
the dependent variable (which is assumed to be used) still has the form:

\[ y_{ijt} - y_{ij\cdot} - y_{i\cdot t} - y_{\cdot jt} + y_{i\cdot\cdot} + y_{\cdot j\cdot} + y_{\cdot\cdot t} - y_{\cdot\cdot\cdot}, \]

where now

\[ (*) \quad y_{ij\cdot} = \frac{1}{\|T(i,j)\|}\sum_{t \in T(i,j)} y_{ijt}, \]

where, for example, T(i,j) is the set of all years in which the
observations of $y_{ijt}$, for Rate i and pay grade j, are used and
||T(i,j)|| is the number of elements in T(i,j). The normalized value of
any observation which is not used in the least squares estimation is
taken to be zero. The same form applies for normalization of the
explanatory variables in X. With a little reflection it is seen that, in
effect, this normalization process implicitly takes the value of an
unused observation of any variable to be the sum of the appropriate means
over observations which are in fact used. That is, an unused observation
$y_{ijt}$ is taken to be equal to:

\[ y_{ij\cdot} + y_{i\cdot t} + y_{\cdot jt} - y_{i\cdot\cdot} - y_{\cdot j\cdot} - y_{\cdot\cdot t} + y_{\cdot\cdot\cdot}, \]

where the terms on the right hand side of this equation are as given in
(*) above. In particular, this modified normalization process is applied
to the disturbance terms $\varepsilon_{ijt}$ as well. Let $\mu$ represent the
n-vector (n < IJT is the number of observations used) of disturbance terms
under the modified normalization. Define, as in the preceding section,

\[ V_0 = \frac{1}{\sigma^2}\,E(\mu\mu'), \]

where the matrix $V_0$ has order n < IJT. Note that the diagonal element
of $V_0$ which corresponds to observation (i,j,t) is equal to:

\[ \frac{(\|I(j,t)\| - 1)(\|J(i,t)\| - 1)(\|T(i,j)\| - 1)}{\|I(j,t)\|\,\|J(i,t)\|\,\|T(i,j)\|}, \]

since it represents the variance of a component of $\mu$ derived through the
modified normalization specified in (*) above. Thus, the trace of $V_0$ is
equal to:

\[ \sum_{(i,j,t)\ \text{used}} \frac{(\|I(j,t)\| - 1)(\|J(i,t)\| - 1)(\|T(i,j)\| - 1)}{\|I(j,t)\|\,\|J(i,t)\|\,\|T(i,j)\|}. \]

Note also that $V_0$ is symmetric and that for an arbitrary n-component
disturbance vector $\varepsilon$, $V_0V_0\varepsilon = V_0\varepsilon$, so that $V_0$ is
idempotent. That this is so is clear since for $\varepsilon_{ij\cdot}$,
$\varepsilon_{i\cdot t}$, $\varepsilon_{\cdot jt}$, $\varepsilon_{i\cdot\cdot}$, $\varepsilon_{\cdot j\cdot}$,
$\varepsilon_{\cdot\cdot t}$ and $\varepsilon_{\cdot\cdot\cdot}$ as specified in the equations (*),

\[ V_0[\varepsilon_{ij\cdot}] = V_0[\varepsilon_{i\cdot t}] = V_0[\varepsilon_{\cdot jt}] = V_0[\varepsilon_{i\cdot\cdot}] = V_0[\varepsilon_{\cdot j\cdot}] = V_0[\varepsilon_{\cdot\cdot t}] = V_0[\varepsilon_{\cdot\cdot\cdot}] = 0. \]

The matrix $V_0$ has properties analogous to the matrix V considered
previously, and represents the linear transformation which projects an
n-vector of observations into the modified normalization of that vector.


Now let

\[ N(n) = \mathrm{tr}(V_0) = \sum_{(i,j,t)\ \text{used}} \frac{(\|I(j,t)\| - 1)(\|J(i,t)\| - 1)(\|T(i,j)\| - 1)}{\|I(j,t)\|\,\|J(i,t)\|\,\|T(i,j)\|} \]

and let $M_0 = V_0 - V_0X(X'V_0X)^{-1}X'V_0$, where X is now the n x k matrix of
observations which results from removing the IJT - n unused observations
from the original IJT x k matrix of observations X. Then the error sum
of squares for the least squares estimation in modified normalized form
of the data (with unused observations removed) is $e'e = \varepsilon'M_0\varepsilon$,
where $M_0$ is an idempotent matrix of rank N(n) - k. That $M_0$ is
idempotent is clear since

\[ M_0M_0 = [V_0 - V_0X(X'V_0X)^{-1}X'V_0][V_0 - V_0X(X'V_0X)^{-1}X'V_0] \]
\[ = V_0 - V_0X(X'V_0X)^{-1}X'V_0 - V_0X(X'V_0X)^{-1}X'V_0 + V_0X(X'V_0X)^{-1}X'V_0X(X'V_0X)^{-1}X'V_0 \]
\[ = V_0 - V_0X(X'V_0X)^{-1}X'V_0 = M_0. \]

And $M_0$ has trace (hence rank) N(n) - k since:

\[ \mathrm{tr}(M_0) = \mathrm{tr}[V_0 - V_0X(X'V_0X)^{-1}X'V_0] = \mathrm{tr}(V_0) - \mathrm{tr}[V_0X(X'V_0X)^{-1}X'V_0] \]
\[ = \mathrm{tr}(V_0) - \mathrm{tr}[(X'V_0X)^{-1}X'V_0X] = N(n) - k. \]

Hence for disturbance term $\varepsilon$ specified by:

\[ \varepsilon_{ijt} = \eta_{ijt} + \alpha_i + \gamma_j + \delta_t + \lambda_{ij} + \omega_{it} + \tau_{jt}, \]

where the $\eta_{ijt}$'s are independent identically distributed Normal random
variables with mean zero and variance $\sigma^2$, $\varepsilon'M_0\varepsilon/\sigma^2$ has $\chi^2$
distribution with N(n) - k degrees of freedom. Thus, for the estimator:

\[ s^2 = \frac{e'e}{N(n) - k} = \frac{\varepsilon'M_0\varepsilon}{N(n) - k}, \]

$[N(n) - k]\,s^2/\sigma^2$ has $\chi^2$ distribution with N(n) - k degrees of freedom.


For those cases in which the removal of observations is not systematic
(that is, when observations are discarded in no regular pattern),
computation of N(n) may involve many computations and may require that one
keep track of a large number of values of ||I(j,t)||, ||J(i,t)|| and
||T(i,j)||. It may, therefore, be beneficial to derive the distribution of
an alternative random variable linear in $s^2$. The quantity:

\[ \left[\frac{\dfrac{(I-1)(J-1)(T-1)}{IJT}\,n - k}{N(n) - k}\right]\frac{[N(n) - k]\,s^2}{\sigma^2} \]

is linear in $[N(n) - k]\,s^2/\sigma^2$, hence has $\chi^2$ distribution, with
degrees of freedom given by:

\[ \left[\frac{\dfrac{(I-1)(J-1)(T-1)}{IJT}\,n - k}{N(n) - k}\right][N(n) - k] = \frac{(I-1)(J-1)(T-1)\,n}{IJT} - k. \]

Thus the analogy is completed.
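
The bookkeeping for N(n) is easy to automate, which also makes it possible to compare N(n) with the simpler quantity (I-1)(J-1)(T-1)n/IJT. The sketch below takes the diagonal elements of the modified transformation exactly as stated in this section; the function names and the example pattern of discarded observations are hypothetical.

```python
from fractions import Fraction
from itertools import product

def n_of_used(used):
    """N(n): sum over the used (i, j, t) cells of the stated diagonal element."""
    used = set(used)
    total = Fraction(0)
    for (i, j, t) in used:
        n_i = sum(1 for (a, b, c) in used if b == j and c == t)  # ||I(j,t)||
        n_j = sum(1 for (a, b, c) in used if a == i and c == t)  # ||J(i,t)||
        n_t = sum(1 for (a, b, c) in used if a == i and b == j)  # ||T(i,j)||
        total += Fraction((n_i - 1) * (n_j - 1) * (n_t - 1), n_i * n_j * n_t)
    return total

def approximate_n(n, I, J, T):
    """The simpler quantity (I-1)(J-1)(T-1) n / IJT."""
    return Fraction((I - 1) * (J - 1) * (T - 1) * n, I * J * T)

I, J, T = 3, 3, 4   # hypothetical numbers of categories
full = list(product(range(I), range(J), range(T)))
full_exact = n_of_used(full)
full_approx = approximate_n(len(full), I, J, T)

partial = [c for c in full if c != (0, 0, 0)]   # discard one observation
partial_exact = n_of_used(partial)
partial_approx = approximate_n(len(partial), I, J, T)
print(full_exact, full_approx)
print(float(partial_exact), float(partial_approx))
```

With no observations discarded both quantities reduce to (I-1)(J-1)(T-1); once observations are dropped they generally differ, which is why the choice between them matters for the degrees of freedom.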


I. GENERALIZATION TO q CROSS-SECTIONS 

There is a natural generalization of all of the preceding sections
to the case in which q cross-sectional dimensions are involved.
Previously, recall, all was described in terms of three cross-sectional
dimensions.

Suppose q cross-sectional dimensions are being considered in the
model $Y = X\beta + Z\Omega + \varepsilon$. Analogously to the case for q = 3, let the
variables whose observations are represented by X and Y vary over all
q dimensions, and let each variable in Z vary over at most q - 1
dimensions. Also let the disturbance term $\varepsilon$ be constructed
analogously to the previously considered case, q = 3. That is, for q
cross-sectional dimensions, with respective numbers of categories
$I_1, \ldots, I_q$, $\varepsilon$ is a linear combination of:

\[ \prod_{k=1}^{q} I_k \]

random vectors, one of which varies over q cross-sectional dimensions
(let this single random vector be denoted as $\eta$, as before, where the
elements of $\eta$ are written with q subscripts) and the remaining

\[ \prod_{k=1}^{q} I_k - 1 \]

of which vary over at most q - 1 dimensions (that is, the elements of
each of these remaining random vectors are written with fewer than q
subscripts). Also, the elements of $\eta$ are independent, identically
distributed Normal random variables, each with mean zero and variance
$\sigma^2$, and the remaining random variables are subject to any unknown
distributions, and to any unknown conditions of stochastic
non-independence.

All the properties that have been derived in preceding sections
flowed naturally from a knowledge of the idempotent matrix V. Thus, in
order to characterize the general case for q cross-sectional dimensions,
it is only necessary to find the appropriate matrix $V_q$ whose properties
are analogous to those of the previously defined V. To this end, let
$C_{ij}$ be the subscript (in the notational expression for the elements of
$\eta$; there are q such subscripts in the notational expression for each
element of $\eta$) representing the i-th category of the j-th cross-sectional
dimension, j = 1, ..., q, i = 1, ..., $I_j$.

Then the elements of $V_q = \frac{1}{\sigma^2}E(\mu\mu')$ are given, for
$i_h = 1, \ldots, I_h$ and $j_h = 1, \ldots, I_h$, h = 1, ..., q, by

\[ \frac{1}{\sigma^2}\,E\!\left(\mu_{C_{i_1 1} \cdots C_{i_q q}}\;\mu_{C_{j_1 1} \cdots C_{j_q q}}\right) = (-1)^{q-p}\,\frac{\prod_{m \in S}(I_m - 1)}{\prod_{k=1}^{q} I_k}, \]

where:

\[ S = \{\, m : C_{i_m m} = C_{j_m m} \,\} \]

and p is the number of elements in S. When S is empty, define
$\prod_{m \in S}(I_m - 1) = 1$.

That is: S is the set of all cross-sectional dimensions for which the
subscripts $C_{i_m m}$ and $C_{j_m m}$ are equal in the variables
$\mu_{C_{i_1 1} \cdots C_{i_q q}}$ and $\mu_{C_{j_1 1} \cdots C_{j_q q}}$, whose covariance is
an element of $V_q$. Or: S is the set of all cross-sectional dimensions for
which the above two random variables correspond to the same category.
Note that the set S depends on the two elements of $\mu$ whose covariance
is being considered.

To complete the analogy to the case q = 3, $V_q$ is an idempotent matrix
of order

\[ \prod_{k=1}^{q} I_k \]

and trace (= rank)

\[ \prod_{k=1}^{q} (I_k - 1). \]
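
The general element formula lends itself to a direct numerical check. The sketch below builds the matrix for a small hypothetical case with q = 4 and confirms the stated order, trace and idempotency; the function names and category counts are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import prod

def vq_element(a, b, dims):
    """(-1)^(q-p) * prod_{m in S}(I_m - 1) / prod_k I_k,
    where S is the set of dimensions in which index tuples a and b agree."""
    q = len(dims)
    S = [m for m in range(q) if a[m] == b[m]]
    p = len(S)
    numerator = prod(dims[m] - 1 for m in S)   # empty product is 1
    return Fraction((-1) ** (q - p) * numerator, prod(dims))

def build_vq(dims):
    """Full matrix over all combinations of categories."""
    cells = list(product(*(range(d) for d in dims)))
    return [[vq_element(a, b, dims) for b in cells] for a in cells]

def matmul(x, y):
    """Plain matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*y))
    return [[sum(u * v for u, v in zip(row, col)) for col in cols] for row in x]

dims = (2, 3, 2, 2)   # hypothetical category counts for q = 4
Vq = build_vq(dims)
order = len(Vq)
trace = sum(Vq[r][r] for r in range(order))
idempotent = matmul(Vq, Vq) == Vq
print(order, trace, idempotent)
```

Setting `dims` to a triple (I, J, T) reproduces the three-dimensional matrix V of the earlier sections.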


J. THE INAPPROPRIATELY APPLIED MODEL: A CASE IN WHICH DISTURBANCE 
STRUCTURE (A) DOES NOT APPLY 


Before proceeding with this section, it may be instructive to amplify
on the derivation of the transformation V. Note that the originally
stated purpose of the transformation V was to rid the model
$Y = X\beta + Z\Omega + \varepsilon$ of the effects of certain unobserved or unobservable
explanatory variables. The disturbance structure (A) hypothesized in
Part D was constructed, more or less artificially, to take advantage of
the properties of V. Disturbance structure (A) is simply the most general
case of the original problem: it contains all possible sources of error
which the transformation V is able to remove. Consider a model of the form
$Y = X\beta + Z\Omega + \varepsilon$ as previously introduced. Then the following
statements are equivalent:

a. $\varepsilon$ obeys disturbance structure (A).

b. The elements of $\varepsilon$ are independent, identically distributed Normal
random variables, each with mean zero and variance $\sigma^2$, and
included in the specification of the model (specifically, in Z)
is any variable (observed or not) which may be written as varying
over fewer than q cross-sectional dimensions (q is the total
number of dimensions involved in the data).

c. No knowledge or information about the disturbance term $\varepsilon$ may
reasonably be assumed except that at least one component of each
$\varepsilon_{ijt}$ is a sample from a Normal population with mean zero and
variance $\sigma^2$.
This situation suggests two useful observations. The first concerns
the unobserved or unobservable explanatory variables which, by the
dictates of theory (that is, theory relating to the subject being modeled)
or other considerations, are necessarily included in some model of the
form considered here. Note that, since the transformation V rids the
model of these variables (as long as each of these variables varies over
fewer than q cross-sectional dimensions, where q is the total number of
dimensions involved) in any case, it is conceptually and practically
equivalent whether these variables are explicitly included in the formal
statement of the model, or whether they are implicitly "thrown into" the
disturbance term. This is a trite observation, but it is well worth
noting for the following reason: some studies and analyses (see, for
example, Nerlove [8]), when implicitly including an unobserved or
unobservable explanatory variable as a component of the disturbance term,
make a strong and possibly erroneous assumption in order to complete the
regression analysis (that is, in order to be able to claim an unbiased
estimator of the regression coefficients) without using some
transformation such as V to purge the model of the offending variable.
[The term "erroneous" should be seen in context. The case of interest
here is that in which there exists some unobserved explanatory variable
which is expected to have a significant effect on the dependent
variable. In addition, it is supposed that the analyst has no (or
does not care to get any) information about the values of this
variable. Such a variable may indeed not even be quantifiable.]
Specifically, the required assumption is that the disturbance term (which
now implicitly includes unobserved and unobservable explanatory variables)
has known mean, usually zero. [This assumption is characterized as
"required" since unless it is made, some unobserved explanatory variable
is, in effect, still being considered an explicit term in the model.]
[It is further typically assumed that the disturbance term is Normally
distributed, although this assumption is not necessary if all one wishes
to do is ensure that the estimator is unbiased.] That this assumption may
be erroneous can be seen in two approaches to the assumption. One may
simply make this assumption with no justification. But since theory, or
other consideration, has dictated that the unobserved explanatory
variable does have an effect on the dependent variable, the original
problem still remains. And the resolution to that problem is still to
remove the offending explanatory variable (whether explicitly included in
the model or implicitly included as a component of the disturbance term)
by some transformation such as V. [Note that V may not be unique in this
respect. For example, in the model
$Y_{it} = \alpha + \beta X_{it} + \gamma Z_i + \varepsilon_{it}$, where one wishes to purge
$Z_i$, the transformation W may be used, where
$W[Y_{it}] = [Y_{it} - Y_{i\cdot}]$, $W[X_{it}] = [X_{it} - X_{i\cdot}]$ and
$W[Z_i] = [Z_i - Z_i] = 0$. Here $[P_{it}]$ is an n-vector whose elements are
$P_{it}$.] Alternatively, one may attempt to justify the assumption by means
of some device such as the Central Limit Theorem, in this case making the
additional assumption that the components of the disturbance term, which
now includes the unobserved explanatory variables, are independent.
Ignoring for the moment the fact that this latter assumption is contrary
to the assumptions of disturbance structure (A), this sort of argument may
be reasonable in some cases. But in justifying the application of the
Central Limit Theorem, in order to approximate a Normal random variable of
known mean by a sum of random variables, one typically assumes that the
disturbance term represents the net effect of numerous individually
unimportant but collectively significant variables. But this is clearly
not the case (at least this latest assumption cannot reasonably be made)
when disturbance structure (A) pertains. And, more generally, it can be
said that there are certainly studies of interest where this is not the
case: the unobserved explanatory variable whose inclusion in the model was
a necessity cannot in general be assumed not to dominate the disturbance
term in which it is incorporated. In summary, there exist studies for
which the use of a transformation such as V, to rid the model of undesired
variables, is unavoidable if an unbiased estimator of the regression
coefficients is to be obtained. Simply discarding an undesired variable as
a component of a disturbance term with known mean should be viewed
cautiously. As

an example, in the reenlistment model, the inclusion of the terms WO. 

and C+ in the disturbance term can be expected to have a large effect on 
the disturbance term. 

The second observation concerns the best linear unbiasedness of the
estimator $B = (X'VX)^{-1}X'VY$ for $\beta$ in $Y = X\beta + Z\Omega + \varepsilon$. Recall
that when disturbance structure (A) is assumed, B is the best linear
unbiased estimator for $\beta$. Note that since, in disturbance structure (A),
the random variables $\alpha$, $\gamma$, $\delta$, $\lambda$, $\omega$ and $\tau$ may assume any
(unknown) distribution, and since any error terms in the model (except
the $\eta_{ijt}$'s) may be interdependent, disturbance structure (A) is more
general than that typically assumed (specifically, that error structure
in which the elements of the disturbance term $\varepsilon$ are independent,
identically distributed Normal random variables, each with mean zero and
variance $\sigma^2$). But it is not a generalization of this latter error
structure: the latter is not a special case of disturbance structure (A).
This is so since disturbance structure (A) is based on a certain lack of
specific information or knowledge about the characteristics of the
components of the disturbance term. As a consequence, if the error
structure which one wishes to assume is not that specified by disturbance
structure (A), then $B = (X'VX)^{-1}X'VY$ is not necessarily the best linear
unbiased estimator for $\beta$ in $Y = X\beta + Z\Omega + \varepsilon$.

This last observation leads into the proper subject of this section: a
consideration of a common case in which B is not the best linear unbiased
estimator for $\beta$. For consistency of approach, suppose that the
model is written in the form $Y = X\beta + \varepsilon$, where unobserved
or unobservable explanatory variables (if any), which were previously
included in Z, are now included in the disturbance term $\varepsilon$. As
has been seen, $B = (X'VX)^{-1}X'VY$ is the best linear unbiased
estimator for $\beta$ when $\varepsilon$ obeys disturbance structure (A).
Consider the asymptotic properties of the matrix V in three
cross-sectional dimensions. As the number of categories, I, J, and T, in
each cross-sectional dimension goes to infinity, the elements of V behave
as follows:


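$$
\begin{aligned}
\frac{(I-1)(J-1)(T-1)}{IJT} &\to 1 \\
-\frac{(I-1)(J-1)}{IJT},\quad -\frac{(I-1)(T-1)}{IJT},\quad -\frac{(J-1)(T-1)}{IJT} &\to 0 \\
\frac{I-1}{IJT},\quad \frac{J-1}{IJT},\quad \frac{T-1}{IJT} &\to 0 \\
-\frac{1}{IJT} &\to 0
\end{aligned}
$$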
[Note that when q cross-sectional dimensions are considered, the number
of unique elements in V is $2^q$, since each element of V depends on the
comparison of the subscripts of two random variables, each of which has
q subscripts. These two random variables may either agree or disagree
in each subscript. For q = 3, then, V has $2^3 = 8$ unique elements.]


That is, the diagonal elements of V approach unity and all other elements
of V approach zero. Or, as I, J and T increase without bound, V tends
to the identity matrix. As a consequence, $(X'VX)^{-1}X'VY$ approaches
$(X'X)^{-1}X'Y$ as I, J and T become infinitely large. Hence, in the case
that $\varepsilon$ obeys disturbance structure (A), the ordinary least
squares estimator $\hat{\beta} = (X'X)^{-1}X'Y$ is in the limit (in I, J
and T) an unbiased estimator for $\beta$, since it is the limit of a
sequence of unbiased estimators¹. This suggests that, for sufficiently
large I, J and T, the ordinary least squares estimator for $\beta$,
$\hat{\beta} = (X'X)^{-1}X'Y$, could serve to approximate the best linear
unbiased estimator B when disturbance structure (A) holds. This line of
thought will not be pursued: it is the converse suggestion, that B can
serve to approximate $\hat{\beta}$ for sufficiently large I, J and T,
that is more interesting here. Suppose that the transformation V was
inappropriately applied to the model $Y = X\beta + \varepsilon$.
Specifically, suppose that the components of $\varepsilon$ are
independent, identically distributed Normal random variables with mean
zero and variance $\sigma^2$. Call this disturbance structure (B). Then
the ordinary least squares estimator $\hat{\beta} = (X'X)^{-1}X'Y$ is the
best linear unbiased estimator for $\beta$. Note that
$B = (X'VX)^{-1}X'VY$ is still an unbiased estimator for $\beta$, but it
is no longer best. But since V approaches the identity matrix as I, J
and T increase, the less efficient estimator B approaches $(X'X)^{-1}X'Y$
as well. This suggests a pragmatic comparative scheme for the two
estimators B and $\hat{\beta}$:


¹ In treating a subject related to that considered here, Wallace and
Hussain [9] have shown the asymptotic equivalence of the Aitken
estimator and an estimator derived under a linear transformation
(much as B was derived from the linear transformation V) for a
particular error structure. In the disturbance structure considered
in their paper, the disturbance term was assumed to be a sum of
independent random variables (in a combined time series and cross-
section analysis),

$\varepsilon_{it} = \alpha_i + \gamma_t + \eta_{it}$, for which
$E(\alpha_i) = E(\gamma_t) = E(\eta_{it}) = 0$ and
$\mathrm{Var}(\alpha_i) = \sigma_\alpha^2$,
$\mathrm{Var}(\gamma_t) = \sigma_\gamma^2$,
$\mathrm{Var}(\eta_{it}) = \sigma_\eta^2$ for all i, t,

where $\sigma_\alpha^2$, $\sigma_\gamma^2$ and $\sigma_\eta^2$ were known.
The paper also showed the equivalence of the iterative Aitken estimator
and the estimator derived under a linear transformation for the
disturbance structure as above with $\sigma_\alpha^2$, $\sigma_\gamma^2$
and $\sigma_\eta^2$ unknown.
1. Suppose disturbance structure (A) applies. Then $\hat{\beta}$ is
biased, and B is the best linear unbiased estimator and should
reasonably be used.

2. Suppose on the other hand that disturbance structure (B) is
assumed to hold. Then $\hat{\beta}$ and B are both unbiased
estimators, although B is less efficient than $\hat{\beta}$. But note
that B has an advantage which may offset (on a case-by-case basis)
its lesser efficiency: it is guaranteed to purge all random variables
which are invariant over at least one cross-sectional dimension. That
is, if one is unsure of the validity of the assumption that
disturbance structure (B) holds, then one may see some value in
applying the transformation V in order to rid the model of all
such possible sources of error.
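
The comparative scheme above can be illustrated numerically. The following is a minimal sketch in Python with NumPy, not part of the original analysis. It assumes that V is the Kronecker product of three mean-centering matrices (a form consistent with the limiting behavior of the elements of V described above), a single explanatory variable, and arbitrary illustrative values for I, J, T, the coefficient and the unobserved effect. All function and variable names are hypothetical.

```python
import numpy as np

def centering(n):
    """n x n matrix that subtracts the mean over one dimension."""
    return np.eye(n) - np.ones((n, n)) / n

def make_V(I, J, T):
    """Assumed form of V; cells are ordered (i, j, t) with t varying fastest."""
    return np.kron(centering(I), np.kron(centering(J), centering(T)))

def estimate(X, Y, W=None):
    """Return (X'WX)^{-1} X'WY; W=None gives ordinary least squares."""
    if W is None:
        W = np.eye(len(Y))
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)

rng = np.random.default_rng(0)
I, J, T = 6, 5, 4
beta = 2.0                                     # illustrative true coefficient

alpha = np.linspace(-1.0, 1.0, I)              # unobserved effect, invariant over j and t
alpha_cells = np.repeat(alpha, J * T)          # its value in each (i, j, t) cell
x = alpha_cells + rng.normal(size=I * J * T)   # observed regressor, correlated with alpha
eta = 0.1 * rng.normal(size=I * J * T)         # idiosyncratic error
y = beta * x + 5.0 * alpha_cells + eta         # alpha ends up in the disturbance term

X = x[:, None]
V = make_V(I, J, T)
beta_ols = estimate(X, y)[0]                   # absorbs part of the omitted effect
B = estimate(X, y, V)[0]                       # omitted effect is purged by V

print("OLS:", beta_ols, " B:", B)
```

In this constructed case the ordinary least squares estimate is pulled away from the true coefficient by the omitted effect, while B stays close to it. Dropping the `alpha_cells` term from `y` gives a setting like disturbance structure (B), in which both estimators are unbiased and only their efficiency differs.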

Two concluding observations should now be made. First, it is clear
that application of the transformation V is equally inappropriate in all
other cases where disturbance structure (A) does not hold in the model
$Y = X\beta + \varepsilon$. An important special case is that in which
the generalized least squares estimator for $\beta$ is appropriate. Just
as the ordinary least squares estimator $\hat{\beta} = (X'X)^{-1}X'Y$ is
the best linear unbiased estimator for $\beta$ when $E(\varepsilon) = 0$
and $\mathrm{Var}(\varepsilon) = \sigma^2 I$, the Aitken estimator
$\tilde{\beta} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y$ is the best linear
unbiased estimator for $\beta$ for the case in which $E(\varepsilon) = 0$
and $\mathrm{Var}(\varepsilon) = \sigma^2\Omega$.
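
As an illustration of the Aitken estimator just mentioned, the sketch below computes it directly from its formula and checks it against ordinary least squares applied to data whitened by a Cholesky factor of Ω, a standard equivalence. The covariance pattern Ω (an AR(1)-type matrix), the sample size and the coefficients are assumptions chosen only for illustration and do not come from the reenlistment model.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 50, 2
X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0])                  # illustrative true coefficients

# Assumed known covariance pattern Omega (AR(1)-type), purely for illustration.
rho = 0.6
idx = np.arange(n)
Omega = rho ** np.abs(idx[:, None] - idx[None, :])

L = np.linalg.cholesky(Omega)                 # Omega = L L'
Y = X @ beta + L @ rng.normal(size=n)         # disturbances with Var = Omega (sigma^2 = 1)

# Aitken (generalized least squares) estimator from its formula.
Omega_inv = np.linalg.inv(Omega)
beta_aitken = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ Y)

# Equivalent route: ordinary least squares on the whitened data L^{-1}X, L^{-1}Y.
Xw = np.linalg.solve(L, X)
Yw = np.linalg.solve(L, Y)
beta_whitened = np.linalg.lstsq(Xw, Yw, rcond=None)[0]

print(beta_aitken, beta_whitened)
```

The two routes agree up to rounding, which is one way to see that the Aitken estimator is ordinary least squares after a linear transformation of the data.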


Finally, it is worth repeating the crucial condition which underlies
the specification of the case in which the transformation V is effective.
In the model $Y = X\beta + Z\alpha + \varepsilon$ (or in the equivalent,
under the transformation V, model $Y = X\beta + \varepsilon$, where the
variables in Z are thrown into the disturbance term $\varepsilon$), V is
effective in removing unobserved or unobservable variables (stochastic
or deterministic) only if these variables are invariant over at least
one cross-sectional dimension. Accordingly, all work in this paper is
performed under the assumption that each variable in X (those variables
which vary over all cross-sectional dimensions) has been observed.
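
The condition just restated can be checked directly. The sketch below is a hedged illustration and assumes, as one concrete form, that V is the Kronecker product of three mean-centering matrices. It applies V to a hypothetical two-way effect that is invariant over the third dimension and to a variable that varies over all three dimensions, and it also counts the distinct elements of V. The dimensions and names are arbitrary.

```python
import numpy as np

def centering(n):
    """n x n matrix that subtracts the mean over one dimension."""
    return np.eye(n) - np.ones((n, n)) / n

# Unequal dimensions so that no two of the distinct elements of V coincide.
I, J, T = 3, 4, 5
V = np.kron(centering(I), np.kron(centering(J), centering(T)))

rng = np.random.default_rng(1)
lam = rng.normal(size=(I, J))              # hypothetical two-way effect, invariant over t
z_invariant = np.repeat(lam.ravel(), T)    # its value in each (i, j, t) cell
z_varying = rng.normal(size=I * J * T)     # varies over all three dimensions

removed = np.allclose(V @ z_invariant, 0.0)       # purged by V
survives = not np.allclose(V @ z_varying, 0.0)    # not purged by V
n_unique = len(np.unique(np.round(V, 12)))        # distinct elements of V

print(removed, survives, n_unique)
```

Under these assumptions the invariant variable is mapped to zero, the fully varying one is not, and V has the eight distinct elements expected for three cross-sectional dimensions.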


K. INTERPRETATION OF TERMS UNDER THE TRANSFORMATION V


Consider the model in the form $Y = X\beta + \varepsilon$, in three
cross-sectional dimensions. The equation representing the data in the
ith category of the first cross-sectional dimension, the jth category of
the second dimension and the tth category of the third dimension is
$y_{ijt} = x_{ijt}\beta + \varepsilon_{ijt}$, where $x_{ijt}$ is a
k-vector of observations on the k explanatory variables in X. The
categories of the cross-sectional dimensions corresponding to the
observations $y_{ijt}$ and $x_{ijt}$ may be considered to be "treatments"
which affect the values of the observations of $y_{ijt}$ and $x_{ijt}$ in
the (i, j, t)th "cell". With this in mind, assume that each $y_{ijt}$ and
$x_{ijt}$ can be represented as a sum of common mean, effects due to
single treatments (here i, j, t represent the "treatments"), two-way
interaction effects of pairs of treatments, and a three-way interaction
effect of the three treatments. [Note that since there is only one
observation (on each of $y_{ijt}$ and $x_{ijt}$) per "cell", it is
generally not possible to discern between the effect of the three-way
interaction term and the error term $\varepsilon_{ijt}$. In this case,
however, it is known that a three-way interaction term does in fact
exist. That this is so can be seen as follows: since $x_{ijt}$ is
deterministic, one can calculate the exact three-way interaction effect
for cell (i, j, t) as
$x_{ijt} - \bar{x}_{ij\cdot} - \bar{x}_{i\cdot t} - \bar{x}_{\cdot jt} + \bar{x}_{i\cdot\cdot} + \bar{x}_{\cdot j\cdot} + \bar{x}_{\cdot\cdot t} - \bar{x}_{\cdot\cdot\cdot}$,
subject only to roundoff error (this expression is the same as that of a
sample estimate of the three-way interaction effect for the case of
stochastic $x_{ijt}$). This is not identically zero (for all cells), by
previous hypothesis about the variables in X, so a three-way interaction
effect is present. And since $y_{ijt}$ is a linear function of $x_{ijt}$,
$y_{ijt}$ also includes a three-way interaction effect.]
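That is, that:

$$
\begin{aligned}
y_{ijt} &= \mu + \phi_{ijt} + A_i + B_j + C_t + D_{ij} + E_{it} + F_{jt} + \varepsilon_{ijt} \\
x_{ijt} &= \mu^0 + \phi^0_{ijt} + A^0_i + B^0_j + C^0_t + D^0_{ij} + E^0_{it} + F^0_{jt}
\end{aligned}
$$

where $\phi_{ijt}$ and $\phi^0_{ijt}$ are the three-way interaction terms
mentioned above.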


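Substituting these into the model:

$$
\begin{aligned}
y_{ijt} &= \mu + \phi_{ijt} + A_i + B_j + C_t + D_{ij} + E_{it} + F_{jt} + \varepsilon_{ijt} \\
        &= (\mu^0 + \phi^0_{ijt} + A^0_i + B^0_j + C^0_t + D^0_{ij} + E^0_{it} + F^0_{jt})\beta + \varepsilon_{ijt}
\end{aligned}
$$

These effects can be equated term by term to give:

$$
\begin{aligned}
\mu &= \mu^0\beta \\
A_i &= A^0_i\beta \\
B_j &= B^0_j\beta \\
C_t &= C^0_t\beta \\
D_{ij} &= D^0_{ij}\beta \\
E_{it} &= E^0_{it}\beta \\
F_{jt} &= F^0_{jt}\beta
\end{aligned}
$$

and: $\phi_{ijt} = \phi^0_{ijt}\beta$   (*)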
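Now consider the data under the transformation V:
$VY = VX\beta + V\varepsilon$. In the (i, j, t)th cell this gives:

$$
\begin{aligned}
y_{ijt} &- \bar{y}_{ij\cdot} - \bar{y}_{i\cdot t} - \bar{y}_{\cdot jt} + \bar{y}_{i\cdot\cdot} + \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot t} - \bar{y}_{\cdot\cdot\cdot} \\
&= (x_{ijt} - \bar{x}_{ij\cdot} - \bar{x}_{i\cdot t} - \bar{x}_{\cdot jt} + \bar{x}_{i\cdot\cdot} + \bar{x}_{\cdot j\cdot} + \bar{x}_{\cdot\cdot t} - \bar{x}_{\cdot\cdot\cdot})\beta + (V\varepsilon)_{ijt},
\end{aligned}
$$

where $(V\varepsilon)_{ijt}$ is the (i, j, t)th element of $V\varepsilon$.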


Note that the left hand side of this equation is the sample estimate
of the three-way interaction term $\phi_{ijt}$. And the term in
parentheses on the right hand side is the three-way interaction term
$\phi^0_{ijt}$. This is the relationship specified in (*) above, with a
sample estimate for $\phi_{ijt}$ replacing $\phi_{ijt}$, and with a
disturbance term $(V\varepsilon)_{ijt}$ included. That is, under the
assumption that $y_{ijt}$ and $x_{ijt}$ can each be represented as a sum
of common mean, effects due to single treatments, two-way interaction
effects of pairs of treatments, and a three-way interaction effect, it is
true that $\phi_{ijt} = \phi^0_{ijt}\beta$. Hence $\beta$ can be
estimated by regressing the sample estimate of the three-way interaction
term $\phi_{ijt}$ on the three-way interaction term $\phi^0_{ijt}$. This
is precisely what the estimator $B = (X'VX)^{-1}X'VY$ accomplishes.
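
The interpretation above can be verified numerically. The following sketch is an illustration under stated assumptions rather than the procedure used in this paper: it takes a single explanatory variable, arbitrary dimensions and effects, and a V built as the Kronecker product of three mean-centering matrices. It computes the eight-term three-way interaction estimate cell by cell, confirms that it equals VY, and confirms that regressing the interaction estimate of y on that of x reproduces B.

```python
import numpy as np

def three_way_residual(a):
    """a_ijt - a_ij. - a_i.t - a_.jt + a_i.. + a_.j. + a_..t - a_... for an (I, J, T) array."""
    m = lambda axes: a.mean(axis=axes, keepdims=True)
    return (a - m(2) - m(1) - m(0)
            + m((1, 2)) + m((0, 2)) + m((0, 1))
            - m((0, 1, 2)))

def centering(n):
    """n x n matrix that subtracts the mean over one dimension."""
    return np.eye(n) - np.ones((n, n)) / n

rng = np.random.default_rng(2)
I, J, T = 4, 3, 5
beta = 1.5                                    # illustrative true coefficient

x = rng.normal(size=(I, J, T))                # varies over all three dimensions
y = (beta * x
     + rng.normal(size=(I, 1, 1))             # effect invariant over j and t
     + rng.normal(size=(1, J, T))             # effect invariant over i
     + 0.05 * rng.normal(size=(I, J, T)))     # idiosyncratic error

# Regress the interaction estimate of y on the interaction estimate of x.
phi_y = three_way_residual(y).ravel()
phi_x = three_way_residual(x).ravel()
slope = (phi_x @ phi_y) / (phi_x @ phi_x)

# The same quantity through the matrix form B = (X'VX)^{-1} X'VY.
V = np.kron(centering(I), np.kron(centering(J), centering(T)))
X = x.reshape(-1, 1)
Y = y.ravel()
B = np.linalg.solve(X.T @ V @ X, X.T @ V @ Y)[0]

print(slope, B)
```

Because the assumed V is symmetric and idempotent, $X'VX = (VX)'(VX)$ and $X'VY = (VX)'(VY)$, so the two computations coincide up to rounding.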